Working Papers
-
When to Match: A Cost-Balancing Principle for Dynamic Markets
Jie Liu, Hailun Zhang, Jiheng Zhang
Matching platforms, from ridesharing to food delivery to competitive gaming, face a fundamental operational dilemma: match agents immediately to minimize waiting costs, or delay to exploit the efficiency gains of thicker markets. Yet computing optimal policies is generally intractable, sophisticated algorithms often rely on restrictive distributional assumptions, and common heuristics lack worst-case performance guarantees. We formulate a versatile framework for multi-sided matching with general state-dependent cost structures and non-stationary arrival dynamics. Central to our approach is a cost-balancing principle: match when accumulated waiting cost reaches a calibrated proportion of instantaneous matching cost. This equilibrium condition emerges from fluid-limit analysis and motivates a simple, adaptive Cost-Balancing (CB) algorithm requiring no distributional assumptions. We prove that CB achieves a competitive ratio of $(1+\sqrt{\Gamma})$ under adversarial arrivals, where $\Gamma$ quantifies economies of scale, guaranteeing cost within a constant factor of the offline optimum. In contrast, standard greedy and threshold policies can incur unbounded costs in adversarial scenarios. We further establish a universal lower bound of $(\sqrt{5}+1)/2$ (the golden ratio), quantifying the fundamental price of uncertainty in online matching. Experiments on game matchmaking and real-world food delivery data demonstrate practical effectiveness, with CB consistently outperforming industry-standard heuristics.
-
Large-Scale LLM Inference with Heterogeneous Workloads: Prefill-Decode Contention and Asymptotically Optimal Control
Ruihan Lin, Zezhen Ding, Zean Han, Jiheng Zhang
Large Language Models (LLMs) are rapidly becoming critical infrastructure for enterprise applications, driving unprecedented demand for GPU-based inference services. A key operational challenge arises from the two-phase nature of LLM inference: a compute-intensive prefill phase that processes user input, followed by a memory-bound decode phase that generates output tokens. When these phases share GPU resources, prefill tasks throttle the processing speed of concurrent decodes, creating state-dependent contention. This contention is further complicated by workload heterogeneity, as different applications exhibit vastly different input and output lengths. We develop a stochastic control framework for scheduling heterogeneous LLM workloads across large GPU clusters. We formulate LLM inference as a multiclass many-server queueing network with state-dependent service rates, grounded in empirical iteration-time measurements. We analyze the fluid approximation of this system and solve steady-state linear programs that characterize optimal resource allocation. We design gate-and-route policies that regulate prefill admission and decode routing, and prove that they are asymptotically optimal in the many-GPU limit under both bundled and separate token-pricing schemes. We further extend the framework to incorporate Service Level Indicators (SLIs) such as latency and fairness, providing a general approach to constrained scheduling. Numerical experiments calibrated to empirical iteration-time data demonstrate that our policies outperform standard serving heuristics.
-
Automated Analysis for Operations Research: Achieving Data Efficiency and Reliability with Large Language Models
Zezhen Ding, Tianlong Chen, Jiheng Zhang
Automating Operations Research (OR) analysis, which involves formulating optimization models from natural language descriptions and generating executable solver code, could reduce reliance on scarce expert knowledge. While Large Language Models (LLMs) are natural candidates for this task, they suffer from probabilistic inconsistency, often requiring multiple sampling attempts to produce correct outputs. Moreover, existing learning-based methods face a data efficiency dilemma: high-quality training data is essential for reliability but prohibitively expensive to acquire at scale. We propose OR-R1, a data-efficient framework that integrates Supervised Fine-Tuning (SFT) with Test-Time Group Relative Policy Optimization (TGRPO). Our key innovation, TGRPO, extracts reliable training signals from unlabeled data by generating multiple candidate solutions and treating the solver-verified consensus as the ground truth, thereby effectively resolving the data efficiency dilemma. We provide theoretical guarantees for gradient convergence and demonstrate that this voting-based proxy reward consistently maximizes true solution accuracy. Practically, TGRPO significantly improves output consistency, narrowing the gap between single-attempt (Pass@1) and multiple-attempt (Pass@8) performance from 13% to 7%. Extensive experiments demonstrate that OR-R1 requires only 10% of the synthetic training data used by prior methods, yet achieves 67.7% average accuracy, surpassing baselines by 4.2%.
-
Learning to Bid in Non-Stationary Repeated First-Price Auctions
Zihao Hu, Xiaoyu Fan, Yuan Yao, Jiheng Zhang, Zhengyuan Zhou
First-price auctions have recently gained significant traction in digital advertising markets, exemplified by Google's transition from second-price to first-price auctions. Unlike in second-price auctions, where bidding one's private valuation is a dominant strategy, determining an optimal bidding strategy in first-price auctions is more complex. From a learning perspective, the learner (a specific bidder) can interact with the environment (other bidders, i.e., opponents) sequentially to infer their behaviors. Existing research often assumes specific environmental conditions and benchmarks performance against the best fixed policy (static benchmark). While this approach ensures strong learning guarantees, the static benchmark can deviate significantly from the optimal strategy in environments with even mild non-stationarity. To address such scenarios, a dynamic benchmark —representing the sum of the highest achievable rewards at each time step—offers a more suitable objective. However, achieving no-regret learning with respect to the dynamic benchmark requires additional constraints. By inspecting reward functions in online first-price auctions, we introduce two metrics to quantify the regularity of the sequence of opponents' highest bids, which serve as measures of non-stationarity. We provide a minimax-optimal characterization of the dynamic regret for the class of sequences of opponents' highest bids that satisfy either of these regularity constraints. Our main technical tool is the Optimistic Mirror Descent (OMD) framework with a novel optimism configuration, which is well-suited for achieving minimax-optimal dynamic regret rates in this context. We then use synthetic datasets to validate our theoretical guarantees and demonstrate that our methods outperform existing ones.
-
Staffing under Taylor's Law: A Unifying Framework for Bridging Square-root and Linear Safety Rules
L. Jeff Hong, Weihuan Huang, Jiheng Zhang, Xiaowei Zhang
Staffing rules serve as an essential management tool in service industries to attain target service levels. Traditionally, the square-root safety rule, based on the Poisson arrival assumption, has been commonly used. However, empirical findings suggest that arrival processes often exhibit an "over-dispersion" phenomenon, in which the variance of the arrival exceeds the mean. In this paper, we develop a new doubly stochastic Poisson process model to capture a significant dispersion scaling law, known as Taylor's law, showing that the variance is a power function of the mean. We further examine how over-dispersion affects staffing, providing a closed-form staffing formula to ensure a desired service level. Interestingly, the additional staffing level beyond the nominal load is a power function of the nominal load, with the power exponent lying between $1/2$ (the square-root safety rule) and $1$ (the linear safety rule), depending on the degree of over-dispersion. Extensive numerical experiments with both simulated and real arrival data show that our proposed model and staffing rules significantly outperform various alternatives.
-
Efficient Transfer Learning via Causal Bounds
Xueping Gong, Wei You, Jiheng Zhang
Transfer learning seeks to accelerate sequential decision-making by leveraging offline data from related agents. However, data from heterogeneous sources that differ in observed features, distributions, or unobserved confounders often render causal effects non-identifiable and bias naive estimators. We address this by forming ambiguity sets of structural causal models defined via integral constraints on their joint densities. Optimizing any causal effect over these sets leads to generally non-convex programs whose solutions tightly bound the range of possible effects under heterogeneity or confounding. To solve these programs efficiently, we develop a hit-and-run sampler that explores the entire ambiguity set and, when paired with a local optimization oracle, produces causal bound estimates that converge almost surely to the true limits. We further accommodate estimation error by relaxing the ambiguity set and exploit the Lipschitz continuity of causal effects to establish precise error propagation guarantees. These causal bounds are then embedded into bandit algorithms via arm elimination and truncated UCB indices, yielding optimal gap-dependent and minimax regret bounds. To handle estimation error, we also develop a safe algorithm for incorporating noisy causal bounds. In the contextual-bandit setting with function approximation, our method uses causal bounds to prune both the function class and the per-context action set, achieving matching upper and lower regret bounds with only logarithmic dependence on function-class complexity. Our analysis precisely characterizes when and how causal side-information accelerates online learning, and experiments on synthetic benchmarks confirm substantial regret reductions in data-scarce or confounded regimes.
Publications
2026
-
Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models
Xueping Gong, Wei You, Jiheng Zhang
Operations Research, 2026
We study contextual dynamic pricing, where a decision maker posts personalized prices based on observable contexts and receives binary purchase feedback indicating whether the customer's valuation exceeds the price. Each valuation is modeled as an unknown latent function of the context, corrupted by independent and identically distributed market noise from an unknown distribution. Relying only on Lipschitz continuity of the noise distribution and bounded valuations, we propose a minimax-optimal algorithm. To accommodate the unknown distribution, our method discretizes the relevant noise range to form a finite set of candidate prices, then applies layered data partitioning to obtain confidence bounds substantially tighter than those derived via the elliptical potential lemma. A key advantage is that estimation bias in the valuation function cancels when comparing upper confidence bounds, eliminating the need to know the Lipschitz constant. The framework extends beyond linear models to general function classes through offline regression oracles. Our regret analysis depends solely on the oracle's estimation error, typically governed by the statistical complexity of the class. These techniques yield a regret upper bound matching the minimax lower bound up to logarithmic factors. Furthermore, we refine these guarantees under additional structures—for example, linear valuation models, second-order smoothness, sparsity, and known noise distribution or observable valuations—and compare our bounds and assumptions with prior dynamic-pricing methods. Finally, numerical experiments corroborate the theory and show clear improvements over benchmark methods.
-
Insensitivity of Proportional Fairness in Critically Loaded Bandwidth Sharing Networks
Maria Vlasiou, Jiheng Zhang, Bert Zwart
Queueing Systems, 2026
Proportional fairness is a popular service allocation mechanism to describe and analyze the performance of data networks at flow level. Several authors have shown that the invariant distribution of networks operating according to proportional fairness admits a product form distribution under critical loading. They focus however on exponential job size distributions, leaving the case of general job size distributions as an open question. Motivated by this, we consider a network operating under proportional fairness where the job size distributions are of phase-type. We establish a heavy-traffic process limit theorem and show that the invariant distribution of the limit process is determined by the first moments of the job sizes only. Our analysis relies on a uniform convergence result for a fluid model.
2025
-
Data Sharing in the Metaverse with Key Abuse Resistance Based on Decentralized CP-ABE
Liang Zhang, Zhanrong Ou, Changhui Hu, Haibin Kan, Jiheng Zhang
IEEE Transactions on Computers, 74(3), 901-914, 2025
Data sharing is ubiquitous in the metaverse, which adopts blockchain as its foundation. Blockchain is employed because it enables data transparency, achieves tamper resistance, and supports smart contracts. However, securely sharing data based on blockchain necessitates further consideration. Ciphertext-policy attribute-based encryption (CP-ABE) is a promising primitive to provide confidentiality and fine-grained access control. Nonetheless, authority accountability and key abuse are critical issues that practical applications must address. Few studies have considered CP-ABE key confidentiality and authority accountability simultaneously. To our knowledge, we are the first to fill this gap by integrating non-interactive zero-knowledge (NIZK) proofs into CP-ABE keys and outsourcing the verification process to a smart contract. To meet the decentralization requirement, we incorporate a decentralized CP-ABE scheme into the proposed data sharing system. Additionally, we provide an implementation based on smart contract to determine whether an access control policy is satisfied by a set of CP-ABE keys. We also introduce an open incentive mechanism to encourage honest participation in data sharing. Hence, the key abuse issue is resolved through the NIZK proof and the incentive mechanism. We provide a theoretical analysis and conduct comprehensive experiments to demonstrate the feasibility and efficiency of the data sharing system.
-
MADRNet: Morphology-Aware Dual-path Reversible Network for Sperm Classification
Fan Yang, Jingzhang Sun, Honglan Huang, Liang Zhang, Jiheng Zhang
IEEE Journal of Biomedical and Health Informatics, 2025
Sperm morphology analysis plays a crucial role in the clinical diagnosis of male infertility. However, manual evaluation is inherently subjective, and inconsistencies in diagnostic criteria may compromise accuracy. Some existing sperm image classification models are introduced but requiring manual intervention. Most models lack of consideration of alignment between computational classification and WHO sperm morphology standards. To address these challenges, we propose an innovative morphology-aware dual-path reversible network (MADRNet) in designing our model. We integrate key biomarkers, such as head aspect ratio and acrosomal integrity, both of which are crucial for clinical sperm assessment, into the network. Particularly, the network utilizes a dual-path attention mechanism, incorporating both parallel spatial and channel attention, while embedding the acrosome anatomical constraint within the channel attention. To further enhance the alignment of our model with the WHO standards, we develop a dynamic loss function considering head aspect ratio constraint. Further, we employ a reversible architecture to enable the model to preserve more microscopic details while reducing GPU memory consumption. Experiments on the HuSHeM dataset demonstrate that the model achieves an accuracy of 96.3% and an F1 score of 96.8%.
-
Effort Decisions in Contests with Shared Attributes
Jiahao He, Jiheng Zhang, Rachel Q. Zhang
Mathematics of Operations Research, 2025
Individuals and organizations often face contests that require various skills, which can be developed through time and resource investments. Consider homogeneous contestants participating in multiple contests, each with multiple attributes and a reward for the winner or shared equally in case of a tie. Contestants can invest effort, at a cost, to enhance their skills in these attributes to maximize their expected net gain. Because contests may share some attributes while having unique ones, improving one attribute can impact winning chances differently across contests. This makes deciding how to allocate effort to each attribute a complex challenge. By reformulating the problem, we combine the effects of a contestant's efforts into their expected scores in the contests, simplifying the problem from many attributes to just the number of contests. We find that with two contests, contestants generally adopt a mixed strategy unless the contests are highly random. In less random scenarios, they tend to use more varied mixed strategies.
-
Improving Blockchain Consistency Bound by Assigning Weights to Random Blocks
Xueping Gong, Qing Zhang, Huizhong Li, Jiheng Zhang
Operations Research, 73(4), 2156-2176, 2025
Blockchains based on the celebrated Nakamoto consensus protocol have shown promise in several applications, including cryptocurrencies. However, these blockchains have inherent scalability limits caused by the protocol's consensus properties. In particular, the consistency property demonstrates a tight trade-off between block production speed and the system's security in terms of resisting adversarial attacks. This paper proposes a novel method, Ironclad, that improves blockchain consistency bound by assigning a different weight to randomly selected blocks. We apply our method to the original Nakamoto protocol and rigorously prove that such a combination can improve the consistency bound significantly by analyzing the fundamental consensus properties. Such an improvement enables a much faster block production rate than the original Nakamoto protocol under the same security guarantee with the same proportion of malicious mining power.
-
Efficient Graph Bandit Learning with Side-Observations and Switching Constraints
Xueping Gong, Jiheng Zhang
AAAI, 39(16), 16871-16879, 2025
This paper presents a novel framework for multi-armed bandit problems with side-observations and switching constraints, which arises in a range of real-world applications. To address the challenges of effectively utilizing graph-structured observations while adhering to graph constraints, we design graph-agnostic and graph-aware algorithms tailored to this new setting.
-
Unveiling Discrete Clues: Superior Healthcare Predictions for Rare Diseases
Chuang Zhao, Hui Tang, Jiheng Zhang, Xiaomeng Li
WWW, 1747-1758, 2025
Accurate healthcare prediction is essential for improving patient outcomes. Existing work primarily leverages advanced frameworks like attention or graph networks to capture the intricate collaborative signals in electronic health records. However, prediction for rare diseases remains challenging due to limited co-occurrence and inadequately tailored approaches. To address this issue, this paper proposes UDC, a novel method that unveils discrete clues to bridge consistent textual knowledge and collaborative signals within a unified semantic space, thereby enriching the representation semantics of rare diseases.
2024
-
Routing and Staffing in Customer Service Chat Systems with Generally Distributed Service and Patience Times
Zhenghua Long, Tolga Tezcan, Jiheng Zhang
Manufacturing and Service Operations Management, 26(5), 1674-1691, 2024
We study customer service chat (CSC) systems, in which agents can serve multiple customers simultaneously, with generally distributed service and patience times. The multitasking capability of agents introduces idiosyncratic challenges when making routing and staffing decisions. To determine the dynamic matching of arriving customers with available agents, we first formulate a routing linear program (LP) based on system primitives. Inspired by the optimal solution of the routing LP, we design a parsimonious dynamic routing policy that is independent of arrival rate and service capacity information. We also use the optimal solution to develop closed-form approximations for crucial performance metrics and show that a similar LP can be utilized to make staffing decisions.
-
The Generalized $c/\mu$ Rule for Queues with Heterogeneous Server Pools
Zhenghua Long, Hailun Zhang, Jiheng Zhang, Zhe George Zhang
Operations Research, 72(6), 2488-2506, 2024
We study the optimal control of queueing systems with heterogeneous server pools and a single customer class. The goal is to balance the holding cost of the queue with the operating costs of the server pools. We introduce the target-allocation policy, the $Gc/\mu$ rule, and the fixed priority policy for systems with general, convex, and concave cost functions, respectively. We also consider an extension to minimize operating costs and maintain a service-level target for customers waiting in the queue.
-
Technical Note -- Asymptotically Optimal Control of Omnichannel Service Systems with Pick-up Guarantees
Xuefeng Gao, Junfei Huang, Jiheng Zhang
Operations Research, 72(4), 1739-1748, 2024
Both walk-in and online customers appear in many service systems such as restaurants. Online customers are often given a target time for pick up, and the quality of the food/beverage can degrade if it is ready before the arrival of online customers. This distinctive feature brings essential difficulties in the analysis and control of such systems. We study the optimal control problem of such service systems based on a two-class single server queueing model and propose a nearly-optimal policy that keeps the server idle when the queue of online customers drops below a threshold and there are no walk-in customers.
-
On-Demand Ride-Matching in a Spatial Model with Abandonment and Cancellation
Guangju Wang, Hailun Zhang, Jiheng Zhang
Operations Research, 72(3), 1278-1297, 2024
We propose a spatial model to approximate pickup times based on the number of waiting passengers and idle drivers. We analyze the dynamics of passengers and drivers in a queueing model in which the platform can control the matching process by setting a threshold on the expected pickup time. Applying fluid approximations, we obtain accurate performance evaluations and an elegant optimality condition based on which we propose a policy that adapts to time-varying demand.
-
RL in Markov Games with Independent Function Approximation: Improved Sample Complexity Bound under the Local Access Model
Junyi Fan, Yuxuan Han, Jialin Zeng, Jianfeng Cai, Yang Wang, Yang Xiang, Jiheng Zhang
AISTATS, 238, 2035-2043, 2024
Efficiently learning equilibria with large state and action spaces in general-sum Markov games while overcoming the curse of multi-agency is a challenging problem. Recent works have attempted to solve this problem by employing independent linear function classes to approximate the marginal $Q$-value for each agent. However, existing sample complexity bounds under such a framework have a suboptimal dependency on the desired accuracy or the action space. In this work, we introduce a new algorithm, Lin-Confident-FTRL, for learning coarse correlated equilibria (CCE) with local access to the simulator.
-
Single-Trajectory Distributionally Robust Reinforcement Learning
Zhipeng Liang, Xiaoteng Ma, Jose Blanchet, Jun Yang, Jiheng Zhang, Zhengyuan Zhou
ICML, 2024
To mitigate the limitation that the classical reinforcement learning (RL) framework heavily relies on identical training and test environments, Distributionally Robust RL (DRRL) has been proposed to enhance performance across a range of environments, possibly including unknown test environments. As a price for robustness gain, DRRL involves optimizing over a set of distributions, which is inherently more challenging than optimizing over a fixed distribution in the non-robust case. In this paper, we design a first fully model-free DRRL algorithm, called distributionally robust $Q$-learning with single trajectory (DRQ).
2023
-
Blockchain Operations in the Presence of Security Concerns
Jiahao He, Guangyuan Zhang, Jiheng Zhang, Rachel Zhang
Manufacturing and Service Operations Management, 25(3), 1117-1135, 2023
A blockchain payment system, such as Bitcoin or Ethereum, validates electronic transactions and stores them in a chain of blocks without a central authority. Miners with computing power compete for the rights to create blocks according to a preset protocol, referred to as hashing or mining, and, in return, earn fees paid by users who submit transactions. Because of security concerns caused by decentralization, a transaction is confirmed after a number of additional blocks are subsequently extended to the block containing it. This confirmation latency introduces an intricate interplay between miners and users. This paper provides approximate system equilibria and studies optimal designs of a blockchain.
-
Achieving Near-optimal Regrets in Confounded Contextual Bandits
Xueping Gong, Jiheng Zhang
AAMAS, 2643-2645, 2023
The contextual bandit problem is a theoretically justified framework with wide applications in various fields. While the previous study on this problem usually requires independence between noise and contexts, our work considers a more sensible setting where the noise becomes a latent confounder that affects both contexts and rewards. To deal with the challenges brought by the confounder, we apply the dual instrumental variable regression, which can correctly identify the true reward function.
-
Debiasing Recommendation by Learning Identifiable Latent Confounders
Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo
KDD, 3353-3363, 2023
Recommendation systems aim to predict users' feedback on items not exposed to them. Confounding bias arises due to the presence of unmeasured variables that can affect both a user's exposure and feedback. In this work, we propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables to resolve the non-identification issue. The proposed iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees.
-
Optimal Contextual Bandits with Knapsacks under Realizability via Regression Oracles
Yuxuan Han, Jialin Zeng, Yang Wang, Yang Xiang, Jiheng Zhang
AISTATS, 206, 5011-5035, 2023
We study the stochastic contextual bandit with knapsacks (CBwK) problem, where each action, taken upon a context, not only leads to a random reward but also costs a random resource consumption in a vector form. The challenge is to maximize the total reward without violating the budget for each resource. Motivated by online regression oracles that have been successfully applied to contextual bandits, we propose the first universal and optimal algorithmic framework for CBwK by reducing it to online regression.
2022
-
Simulating Multi-Asset Classes Prices Using Wasserstein Generative Adversarial Network: A Study of Stocks, Futures and Cryptocurrency
Feng Han, Xiaojuan Ma, Jiheng Zhang
Journal of Risk and Financial Management, 15(1), 2022
Financial data are expensive and highly sensitive with limited access. We aim to generate abundant datasets given the original prices while preserving the original statistical features. We introduce the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) into the field of the stock market, futures market and cryptocurrency market.
-
Approximations and Optimal Control for State-Dependent Limited Processor Sharing Queues
Varun Gupta, Jiheng Zhang
Stochastic Systems, 12(2), 205-225, 2022
We study approximations and control of a processor sharing (PS) server where the service rate depends on the number of jobs occupying the server. The control of such a system is implemented by imposing a limit on the number of jobs that can share the server concurrently, with the rest of the jobs waiting in a first-in-first-out (FIFO) buffer. A desirable control scheme should strike the right balance between efficiency (operating at a high service rate) and parallelism (preventing small jobs from getting stuck behind large ones). We use the framework of heavy-traffic diffusion analysis to devise near optimal control heuristics for such a queueing system.
-
A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds
Jiahao He, Jiheng Zhang, Rachel Zhang
ICML, 162, 8660-8677, 2022
Linear contextual bandits and their variants are usually solved using algorithms guided by parameter estimation. Cauchy-Schwarz inequality established that estimation errors dominate algorithm regrets, and thus, accurate estimators suffice to guarantee algorithms with low regrets. In this paper, we complete the reverse direction by establishing the necessity. In particular, we provide a generic transformation from algorithms for linear contextual bandits to estimators for linear models, and show that algorithm regrets dominate estimation errors of their induced estimators.
-
Private Streaming SCO in $\ell_p$-geometry with Applications in High Dimensional Online Decision Making
Yuxuan Han, Zhicong Liang, Zhipeng Liang, Yang Wang, Yuan Yao, Jiheng Zhang
ICML, 162, 8249-8279, 2022
Differentially private (DP) stochastic convex optimization (SCO) is ubiquitous in trustworthy machine learning algorithm design. This paper studies the DP-SCO problem with streaming data sampled from a distribution and arrives sequentially. We also consider the continual release model where parameters related to private information are updated and released upon each new data. We propose a private variant of online Frank-Wolfe algorithm with recursive gradients for variance reduction to update and reveal the parameters upon each data.
2021
-
Consensus Mechanism Design based on Structured Directed Acyclic Graphs
Jiahao He, Guangju Wang, Guangyuan Zhang, Jiheng Zhang
Blockchain: Research and Applications, 2(1), 2021
Capacity limit is a bottleneck for broader applications of blockchain systems. Scaling up capacity while preserving security and decentralization are major challenges in blockchain infrastructure design. In this paper, we design a proof of work-based mechanism by endowing directed acyclic graphs (DAG) with a novel structure so that peers can reach consensus at a large scale. At a high level, we break large blocks into smaller ones to improve utilization of broadcast network and embed a Nakamoto chain inside the DAG in a decent way to ensure security.
-
Generalized Linear Bandits with Local Differential Privacy
Yuxuan Han, Zhipeng Liang, Yang Wang, Jiheng Zhang
NeurIPS, 26511-26522, 2021
Contextual bandit algorithms are useful in personalized online decision-making. However, many applications such as personalized medicine and online advertising require the utilization of individual-specific information for effective learning, while user's data should remain private from the server due to privacy concerns. This motivates the introduction of local differential privacy (LDP) to contextual bandits. In this paper, we design LDP algorithms for stochastic generalized linear bandits to achieve the same regret bound as in non-privacy settings.
2020
-
Simple Policies with Provable Bounds for Managing Perishable Inventory
Hailun Zhang, Jiheng Zhang, Rachel Zhang
Production and Operations Management, 29(11), 2637-2650, 2020
The two fundamental decisions in the management of perishable inventory are how much new inventory to order and how much old inventory to clear before expiration. These decisions are known to be difficult due to the curse of dimensionality. We propose policies that are much simpler and easier to implement than existing ones in the literature. Our analysis revealed interesting insights into the circumstances under which perishability is negligible.
-
Online Demand Fulfillment under Limited Flexibility
Zhen Xu, Hailun Zhang, Jiheng Zhang, Rachel Zhang
Management Science, 66(10), 4667-4685, 2020
We study online demand fulfillment in a class of networks with limited flexibility and arbitrary numbers of resources and request types. We show analytically that such a network is both necessary and sufficient to guarantee a performance gap independent of the market size compared with networks with full flexibility, extending the previous literature from the long chains to more general sparse networks.
-
Dynamic Scheduling of Multiclass Many-Server Queues with Abandonment: The Generalized $c\mu/h$ Rule
Zhenghua Long, Nahum Shimkin, Hailun Zhang, Jiheng Zhang
Operations Research, 68(4), 1218-1230, 2020
We propose three scheduling policies to cope with any general cost functions and general patience-time distributions. Our first contribution is to introduce the target-allocation policy, which assigns higher priority to customer classes with larger deviation from the desired allocation of the service capacity and prove its optimality for any general queue-length cost functions and patience-time distributions.
2019
-
Virtual Allocation Policies for Many-server Queues with Abandonment
Zhenghua Long, Jiheng Zhang
Mathematical Methods of Operations Research, 90, 399-451, 2019
We study a multiclass many-server queueing system with renewal arrivals and generally distributed service and patience times under a nonpreemptive allocation policy. The status of the system is described by a pair of measure-valued processes to track the residual service and patience times of customers in each class. We establish fluid approximations and study the long-term behavior of the fluid model. The equilibrium state of the fluid model leads to a nonlinear program, which enables us to identify a lower bound for the long-run expected total holding and abandonment costs and design an allocation policy to achieve this lower bound. The optimality of the proposed policy is also demonstrated via numerical experiments.
-
A Note on Many-server Fluid Models with Time-varying Arrivals
Zhenghua Long, Jiheng Zhang
Probability in the Engineering and Informational Sciences, 33(3), 417-437, 2019
We extend the measure-valued fluid model, which tracks residuals of patience and service times, to allow for time-varying arrivals. The fluid model can be characterized by a one-dimensional convolution equation involving both the patience and service time distributions. We also make an interesting connection to the measure-valued fluid model tracking the elapsed waiting and service times.
-
Instantaneous Control of Brownian Motion with a Positive Lead Time
Zhen Xu, Jiheng Zhang, Rachel Zhang
Mathematics of Operations Research, 44(3), 943-965, 2019
Consider a storage system where the content is driven by a Brownian motion in the absence of control. At any time, one may increase or decrease the content at a cost proportional to the amount of adjustment. A decrease of the content takes effect immediately, while an increase is realized after a fixed lead time. Holding costs are incurred continuously over time and are a convex function of the content. The objective is to find a control policy that minimizes the expected present value of the total costs.
2018
-
Management of A Shared Spectrum Network in Wireless Communications
Shining Wu, Jiheng Zhang, Rachel Zhang
Operations Research, 66(4), 1119-1135, 2018
We consider a band of the electromagnetic spectrum with a finite number of identical channels shared by both licensed and unlicensed users. Such a network differs from most many-server, two-class queues in service systems, including call centers, because of the restrictions imposed on the unlicensed users to limit interference to the licensed users. We first approximate the key performance indicators under the asymptotic regime, which requires the analysis of both scaled and unscaled processes simultaneously using the averaging principle.
2017
-
Heavy-traffic Approximations for a Layered Network with Limited Resources
Angelos Aveklouris, Maria Vlasiou, Jiheng Zhang, Bert Zwart
Probability and Mathematical Statistics, 37(2), 497-532, 2017
Motivated by a web-server model, we present a queueing network consisting of two layers. The first layer incorporates the arrival of customers at a network of two single-server nodes. We assume that the interarrival and the service times have general distributions. At the second layer, active servers act as jobs that are served by a single server working at speed one in a processor-sharing fashion. We further assume that the degree of resource sharing is limited by choice, leading to a limited processor-sharing discipline. Our main result is a diffusion approximation for the process describing the number of customers in the system.
-
Refined Models for Efficiency-Driven Queues with Applications to Delay Announcements and Staffing
Junfei Huang, Avishai Mandelbaum, Hanqin Zhang, Jiheng Zhang
Operations Research, 65(5), 1380-1397, 2017
Data has revealed a noticeable impact of delay-time-related information on phone-customers; for example and somewhat surprisingly, delay announcements can abruptly increase the likelihood to abandon (hang up). Our starting point is that the latter phenomena can be used to support the control of queue lengths and delays. We do so by timing the announcements appropriately and determining the staffing levels accordingly.
2016
-
A Unified Approach to Diffusion Analysis of Queues with General Patience-time Distributions
Junfei Huang, Hanqin Zhang, Jiheng Zhang
Mathematics of Operations Research, 41(3), 1135-1160, 2016
We propose a unified approach to establishing diffusion approximations for queues with impatient customers within a general framework of scaling customer patience time. The approach consists of two steps. The first step is to show that the diffusion-scaled abandonment process is asymptotically close to a function of the diffusion-scaled queue length process under appropriate conditions. The second step is to construct a continuous mapping not only to characterize the system dynamics using the system primitives, but also to help verify the conditions needed in the first step.
2015
-
The Impact of Slow Steaming of Ocean Container Transport on Global Supply Chain
Chung-Yee Lee, Hau Lee, Jiheng Zhang
Transportation Research Part E: Logistics and Transportation Review, 76, 176-190, 2015
With a dominant volume of global transportation being conducted by sea, ocean container transport greatly impacts the global economy. Since sea vessels are drastically more fuel efficient when traveling at lower speeds, slow steaming has become a widely adopted practice to reduce bunker costs. However, this leads to a longer transportation time, which together with the unpredictability of the delay has been a big challenge. We propose a model to quantify the relationship among shipping time, bunker cost and delivery reliability. Our findings lead to a simple and implementable policy with a controlled cost and guaranteed delivery reliability.
2014
-
Routing and Staffing in Customer Service Chat Systems with Impatient Customers
Tolga Tezcan, Jiheng Zhang
Operations Research, 62(4), 943-956, 2014
We consider customer service chat (CSC) systems where customers can receive real time service from agents using an instant messaging (IM) application over the Internet. A unique feature of these systems is that agents can serve multiple customers simultaneously. The number of customers that an agent is serving determines the rate at which each customer assigned to that agent receives service. We consider the staffing problem in CSC systems with impatient customers where the objective is to minimize the number of agents while providing a certain service level.
-
Convergence to Equilibrium States for Fluid Models of Many-server Queues with Abandonment
Zhenghua Long, Jiheng Zhang
Operations Research Letters, 42(6-7), 388-393, 2014
Fluid models have become an important tool for the study of many-server queues with general service and patience time distributions. The equilibrium state of a fluid model has been revealed by Whitt (2006) and shown to yield reasonable approximations to the steady state of the original stochastic systems. However, it remains an open question whether the solution to a fluid model converges to the equilibrium state and under what condition. We show in this paper that the convergence holds under a mild condition.
-
Scaling and Modeling of Call Center Arrivals
Xiaowei Zhang, Jeff Hong, Jiheng Zhang
Winter Simulation Conference, 476-485, 2014
Even when they are known to be continuous, Poisson-process rate functions are sometimes specified as piecewise constant. To better approximate the unknown continuous rate function, we fit a piecewise-quadratic function. In addition to maintaining the rate's integral over each time interval, at each interval's end point we match the rates and their first derivatives.
2013
-
Staffing and Control of Instant Messaging Contact Centers
Jun Luo, Jiheng Zhang
Operations Research, 61(2), 328-343, 2013
In addition to traditional call centers, many companies have started building a new kind of customer contact center, in which agents communicate with customers via instant messaging (IM) over the Internet rather than phone calls. A distinctive feature of the service centers based on IM is that one agent can serve multiple customers in parallel. We choose to model such a center as a server pool consisting of many limited processor-sharing servers.
-
Fluid Model of Multi-server Queues with Abandonment
Jiheng Zhang
Queueing Systems, 73(2), 147-193, 2013
We study many-server queues with abandonment in which customers have general service and patience time distributions. The dynamics of the system are modeled using measure-valued processes, to keep track of the residual service and patience times of each customer. Deterministic fluid models are established to provide a first-order approximation for this model.
2012
-
Separation of timescales in a two-layered network
Maria Vlasiou, Jiheng Zhang, Bert Zwart, Rob van der Mei
International Teletraffic Congress, 185-192, 2012
We investigate a computer network consisting of two layers occurring in, for example, application servers. The first layer incorporates the arrival of jobs at a network of multi-server nodes, which we model as a many-server Jackson network. At the second layer, active servers at these nodes act now as customers who are served by a common CPU. Our main result shows a separation of time scales in heavy traffic.
2011
-
Diffusion Approximations of Limited Processor Sharing Queues in Heavy Traffic
Jiheng Zhang, J. G. Dai, Bert Zwart
Annals of Applied Probability, 21(2), 745-799, 2011
We consider a processor sharing queue where the number of jobs served at any time is limited to $K$, with the excess jobs waiting in a buffer. We use random counting measures on the positive axis to model this system. The limit of this measure-valued process is obtained under diffusion scaling and heavy traffic conditions. As a consequence, the limit of the system size process is proved to be a piece-wise reflected Brownian motion.
2009
-
Law of Large Number Limits of Limited Processor Sharing Queues
Jiheng Zhang, J. G. Dai, Bert Zwart
Mathematics of Operations Research, 34(4), 937-970, 2009
Motivated by applications in computer and communication systems, we consider a processor-sharing queue where the number of jobs served is not larger than $K$. We propose a measure-valued fluid model for this limited processor-sharing queue and show that there exists a unique associated fluid model solution. In addition, we show that this fluid model arises as the limit of a sequence of appropriately scaled processor-sharing queues.
2008
-
Steady State Approximations of Limited Processor Sharing Queues in Heavy Traffic
Jiheng Zhang, Bert Zwart
Queueing Systems, 60(3-4), 227-246, 2008
We investigate steady state properties of limited processor sharing queues in heavy traffic. Our analysis builds on previously obtained process limit theorems, and requires the interchange of steady state and heavy traffic limits, which are established by a coupling argument. The limit theorems yield explicit approximations of the steady state queue length and response time distribution in heavy traffic, of which the quality is supported by simulation experiments.
-
Minimax Optimality in Contextual Dynamic Pricing with General Valuation Models
Xueping Gong, Wei You, Jiheng Zhang
Operations Research, 2026
We study contextual dynamic pricing, where a decision maker posts personalized prices based on observable contexts and receives binary purchase feedback indicating whether the customer's valuation exceeds the price. Each valuation is modeled as an unknown latent function of the context, corrupted by independent and identically distributed market noise from an unknown distribution. Relying only on Lipschitz continuity of the noise distribution and bounded valuations, we propose a minimax-optimal algorithm. To accommodate the unknown distribution, our method discretizes the relevant noise range to form a finite set of candidate prices, then applies layered data partitioning to obtain confidence bounds substantially tighter than those derived via the elliptical potential lemma. A key advantage is that estimation bias in the valuation function cancels when comparing upper confidence bounds, eliminating the need to know the Lipschitz constant. The framework extends beyond linear models to general function classes through offline regression oracles. Our regret analysis depends solely on the oracle's estimation error, typically governed by the statistical complexity of the class. These techniques yield a regret upper bound matching the minimax lower bound up to logarithmic factors. Furthermore, we refine these guarantees under additional structures—for example, linear valuation models, second-order smoothness, sparsity, and known noise distribution or observable valuations—and compare our bounds and assumptions with prior dynamic-pricing methods. Finally, numerical experiments corroborate the theory and show clear improvements over benchmark methods.
-
Improving Blockchain Consistency Bound by Assigning Weights to Random Blocks
Xueping Gong, Qing Zhang, Huizhong Li, Jiheng Zhang
Operations Research, 73(4), 2156-2176, 2025
Blockchains based on the celebrated Nakamoto consensus protocol have shown promise in several applications, including cryptocurrencies. However, these blockchains have inherent scalability limits caused by the protocol's consensus properties. In particular, the consistency property demonstrates a tight trade-off between block production speed and the system's security in terms of resisting adversarial attacks. This paper proposes a novel method, Ironclad, that improves blockchain consistency bound by assigning a different weight to randomly selected blocks. We apply our method to the original Nakamoto protocol and rigorously prove that such a combination can improve the consistency bound significantly by analyzing the fundamental consensus properties. Such an improvement enables a much faster block production rate than the original Nakamoto protocol under the same security guarantee with the same proportion of malicious mining power.
-
The Generalized $c/\mu$ Rule for Queues with Heterogeneous Server Pools
Zhenghua Long, Hailun Zhang, Jiheng Zhang, Zhe George Zhang
Operations Research, 72(6), 2488-2506, 2024
We study the optimal control of queueing systems with heterogeneous server pools and a single customer class. The goal is to balance the holding cost of the queue with the operating costs of the server pools. We introduce the target-allocation policy, the $Gc/\mu$ rule, and the fixed priority policy for systems with general, convex, and concave cost functions, respectively. We also consider an extension to minimize operating costs and maintain a service-level target for customers waiting in the queue.
-
Technical Note -- Asymptotically Optimal Control of Omnichannel Service Systems with Pick-up Guarantees
Xuefeng Gao, Junfei Huang, Jiheng Zhang
Operations Research, 72(4), 1739-1748, 2024
Both walk-in and online customers appear in many service systems such as restaurants. Online customers are often given a target time for pick up, and the quality of the food/beverage can degrade if it is ready before the arrival of online customers. This distinctive feature brings essential difficulties in the analysis and control of such systems. We study the optimal control problem of such service systems based on a two-class single server queueing model and propose a nearly-optimal policy that keeps the server idle when the queue of online customers drops below a threshold and there are no walk-in customers.
-
On-Demand Ride-Matching in a Spatial Model with Abandonment and Cancellation
Guangju Wang, Hailun Zhang, Jiheng Zhang
Operations Research, 72(3), 1278-1297, 2024
We propose a spatial model to approximate pickup times based on the number of waiting passengers and idle drivers. We analyze the dynamics of passengers and drivers in a queueing model in which the platform can control the matching process by setting a threshold on the expected pickup time. Applying fluid approximations, we obtain accurate performance evaluations and an elegant optimality condition based on which we propose a policy that adapts to time-varying demand.
-
Dynamic Scheduling of Multiclass Many-Server Queues with Abandonment: The Generalized $c\mu/h$ Rule
Zhenghua Long, Nahum Shimkin, Hailun Zhang, Jiheng Zhang
Operations Research, 68(4), 1218-1230, 2020
We propose three scheduling policies to cope with any general cost functions and general patience-time distributions. Our first contribution is to introduce the target-allocation policy, which assigns higher priority to customer classes with larger deviation from the desired allocation of the service capacity and prove its optimality for any general queue-length cost functions and patience-time distributions.
-
Management of A Shared Spectrum Network in Wireless Communications
Shining Wu, Jiheng Zhang, Rachel Zhang
Operations Research, 66(4), 1119-1135, 2018
We consider a band of the electromagnetic spectrum with a finite number of identical channels shared by both licensed and unlicensed users. Such a network differs from most many-server, two-class queues in service systems, including call centers, because of the restrictions imposed on the unlicensed users to limit interference to the licensed users. We first approximate the key performance indicators under the asymptotic regime, which requires the analysis of both scaled and unscaled processes simultaneously using the averaging principle.
-
Refined Models for Efficiency-Driven Queues with Applications to Delay Announcements and Staffing
Junfei Huang, Avishai Mandelbaum, Hanqin Zhang, Jiheng Zhang
Operations Research, 65(5), 1380-1397, 2017
Data has revealed a noticeable impact of delay-time-related information on phone-customers; for example and somewhat surprisingly, delay announcements can abruptly increase the likelihood to abandon (hang up). Our starting point is that the latter phenomena can be used to support the control of queue lengths and delays. We do so by timing the announcements appropriately and determining the staffing levels accordingly.
-
Routing and Staffing in Customer Service Chat Systems with Impatient Customers
Tolga Tezcan, Jiheng Zhang
Operations Research, 62(4), 943-956, 2014
We consider customer service chat (CSC) systems where customers can receive real time service from agents using an instant messaging (IM) application over the Internet. A unique feature of these systems is that agents can serve multiple customers simultaneously. The number of customers that an agent is serving determines the rate at which each customer assigned to that agent receives service. We consider the staffing problem in CSC systems with impatient customers where the objective is to minimize the number of agents while providing a certain service level.
-
Staffing and Control of Instant Messaging Contact Centers
Jun Luo, Jiheng Zhang
Operations Research, 61(2), 328-343, 2013
In addition to traditional call centers, many companies have started building a new kind of customer contact center, in which agents communicate with customers via instant messaging (IM) over the Internet rather than phone calls. A distinctive feature of the service centers based on IM is that one agent can serve multiple customers in parallel. We choose to model such a center as a server pool consisting of many limited processor-sharing servers.
-
Effort Decisions in Contests with Shared Attributes
Jiahao He, Jiheng Zhang, Rachel Q. Zhang
Mathematics of Operations Research, 2025
Individuals and organizations often face contests that require various skills, which can be developed through time and resource investments. Consider homogeneous contestants participating in multiple contests, each with multiple attributes and a reward for the winner or shared equally in case of a tie. Contestants can invest effort, at a cost, to enhance their skills in these attributes to maximize their expected net gain. Because contests may share some attributes while having unique ones, improving one attribute can impact winning chances differently across contests. This makes deciding how to allocate effort to each attribute a complex challenge. By reformulating the problem, we combine the effects of a contestant's efforts into their expected scores in the contests, simplifying the problem from many attributes to just the number of contests. We find that with two contests, contestants generally adopt a mixed strategy unless the contests are highly random. In less random scenarios, they tend to use more varied mixed strategies.
-
Instantaneous Control of Brownian Motion with a Positive Lead Time
Zhen Xu, Jiheng Zhang, Rachel Zhang
Mathematics of Operations Research, 44(3), 943-965, 2019
Consider a storage system where the content is driven by a Brownian motion in the absence of control. At any time, one may increase or decrease the content at a cost proportional to the amount of adjustment. A decrease of the content takes effect immediately, while an increase is realized after a fixed lead time. Holding costs are incurred continuously over time and are a convex function of the content. The objective is to find a control policy that minimizes the expected present value of the total costs.
-
A Unified Approach to Diffusion Analysis of Queues with General Patience-time Distributions
Junfei Huang, Hanqin Zhang, Jiheng Zhang
Mathematics of Operations Research, 41(3), 1135-1160, 2016
We propose a unified approach to establishing diffusion approximations for queues with impatient customers within a general framework of scaling customer patience time. The approach consists of two steps. The first step is to show that the diffusion-scaled abandonment process is asymptotically close to a function of the diffusion-scaled queue length process under appropriate conditions. The second step is to construct a continuous mapping not only to characterize the system dynamics using the system primitives, but also to help verify the conditions needed in the first step.
-
Law of Large Number Limits of Limited Processor Sharing Queues
Jiheng Zhang, J. G. Dai, Bert Zwart
Mathematics of Operations Research, 34(4), 937-970, 2009
Motivated by applications in computer and communication systems, we consider a processor-sharing queue where the number of jobs served is not larger than $K$. We propose a measure-valued fluid model for this limited processor-sharing queue and show that there exists a unique associated fluid model solution. In addition, we show that this fluid model arises as the limit of a sequence of appropriately scaled processor-sharing queues.
-
Single-Trajectory Distributionally Robust Reinforcement Learning
Zhipeng Liang, Xiaoteng Ma, Jose Blanchet, Jun Yang, Jiheng Zhang, Zhengyuan Zhou
ICML, 2024
To mitigate the limitation that the classical reinforcement learning (RL) framework heavily relies on identical training and test environments, Distributionally Robust RL (DRRL) has been proposed to enhance performance across a range of environments, possibly including unknown test environments. As a price for robustness gain, DRRL involves optimizing over a set of distributions, which is inherently more challenging than optimizing over a fixed distribution in the non-robust case. In this paper, we design a first fully model-free DRRL algorithm, called distributionally robust $Q$-learning with single trajectory (DRQ).
-
A Reduction from Linear Contextual Bandits Lower Bounds to Estimations Lower Bounds
Jiahao He, Jiheng Zhang, Rachel Zhang
ICML, 162, 8660-8677, 2022
Linear contextual bandits and their variants are usually solved using algorithms guided by parameter estimation. Cauchy-Schwarz inequality established that estimation errors dominate algorithm regrets, and thus, accurate estimators suffice to guarantee algorithms with low regrets. In this paper, we complete the reverse direction by establishing the necessity. In particular, we provide a generic transformation from algorithms for linear contextual bandits to estimators for linear models, and show that algorithm regrets dominate estimation errors of their induced estimators.
-
Private Streaming SCO in $\ell_p$-geometry with Applications in High Dimensional Online Decision Making
Yuxuan Han, Zhicong Liang, Zhipeng Liang, Yang Wang, Yuan Yao, Jiheng Zhang
ICML, 162, 8249-8279, 2022
Differentially private (DP) stochastic convex optimization (SCO) is ubiquitous in trustworthy machine learning algorithm design. This paper studies the DP-SCO problem with streaming data sampled from a distribution and arrives sequentially. We also consider the continual release model where parameters related to private information are updated and released upon each new data. We propose a private variant of online Frank-Wolfe algorithm with recursive gradients for variance reduction to update and reveal the parameters upon each data.
-
Insensitivity of Proportional Fairness in Critically Loaded Bandwidth Sharing Networks
Maria Vlasiou, Jiheng Zhang, Bert Zwart
Queueing Systems, 2026
Proportional fairness is a popular service allocation mechanism to describe and analyze the performance of data networks at flow level. Several authors have shown that the invariant distribution of networks operating according to proportional fairness admits a product form distribution under critical loading. They focus however on exponential job size distributions, leaving the case of general job size distributions as an open question. Motivated by this, we consider a network operating under proportional fairness where the job size distributions are of phase-type. We establish a heavy-traffic process limit theorem and show that the invariant distribution of the limit process is determined by the first moments of the job sizes only. Our analysis relies on a uniform convergence result for a fluid model.
-
Fluid Model of Multi-server Queues with Abandonment
Jiheng Zhang
Queueing Systems, 73(2), 147-193, 2013
We study many-server queues with abandonment in which customers have general service and patience time distributions. The dynamics of the system are modeled using measure-valued processes, to keep track of the residual service and patience times of each customer. Deterministic fluid models are established to provide a first-order approximation for this model.
-
Steady State Approximations of Limited Processor Sharing Queues in Heavy Traffic
Jiheng Zhang, Bert Zwart
Queueing Systems, 60(3-4), 227-246, 2008
We investigate steady state properties of limited processor sharing queues in heavy traffic. Our analysis builds on previously obtained process limit theorems, and requires the interchange of steady state and heavy traffic limits, which are established by a coupling argument. The limit theorems yield explicit approximations of the steady state queue length and response time distribution in heavy traffic, of which the quality is supported by simulation experiments.
-
RL in Markov Games with Independent Function Approximation: Improved Sample Complexity Bound under the Local Access Model
Junyi Fan, Yuxuan Han, Jialin Zeng, Jianfeng Cai, Yang Wang, Yang Xiang, Jiheng Zhang
AISTATS, 238, 2035-2043, 2024
Efficiently learning equilibria with large state and action spaces in general-sum Markov games while overcoming the curse of multi-agency is a challenging problem. Recent works have attempted to solve this problem by employing independent linear function classes to approximate the marginal $Q$-value for each agent. However, existing sample complexity bounds under such a framework have a suboptimal dependency on the desired accuracy or the action space. In this work, we introduce a new algorithm, Lin-Confident-FTRL, for learning coarse correlated equilibria (CCE) with local access to the simulator.
-
Optimal Contextual Bandits with Knapsacks under Realizability via Regression Oracles
Yuxuan Han, Jialin Zeng, Yang Wang, Yang Xiang, Jiheng Zhang
AISTATS, 206, 5011-5035, 2023
We study the stochastic contextual bandit with knapsacks (CBwK) problem, where each action, taken upon a context, not only leads to a random reward but also costs a random resource consumption in a vector form. The challenge is to maximize the total reward without violating the budget for each resource. Motivated by online regression oracles that have been successfully applied to contextual bandits, we propose the first universal and optimal algorithmic framework for CBwK by reducing it to online regression.
-
Routing and Staffing in Customer Service Chat Systems with Generally Distributed Service and Patience Times
Zhenghua Long, Tolga Tezcan, Jiheng Zhang
Manufacturing and Service Operations Management, 26(5), 1674-1691, 2024
We study customer service chat (CSC) systems, in which agents can serve multiple customers simultaneously, with generally distributed service and patience times. The multitasking capability of agents introduces idiosyncratic challenges when making routing and staffing decisions. To determine the dynamic matching of arriving customers with available agents, we first formulate a routing linear program (LP) based on system primitives. Inspired by the optimal solution of the routing LP, we design a parsimonious dynamic routing policy that is independent of arrival rate and service capacity information. We also use the optimal solution to develop closed-form approximations for crucial performance metrics and show that a similar LP can be utilized to make staffing decisions.
-
Blockchain Operations in the Presence of Security Concerns
Jiahao He, Guangyuan Zhang, Jiheng Zhang, Rachel Zhang
Manufacturing and Service Operations Management, 25(3), 1117-1135, 2023
A blockchain payment system, such as Bitcoin or Ethereum, validates electronic transactions and stores them in a chain of blocks without a central authority. Miners with computing power compete for the rights to create blocks according to a preset protocol, referred to as hashing or mining, and, in return, earn fees paid by users who submit transactions. Because of security concerns caused by decentralization, a transaction is confirmed after a number of additional blocks are subsequently extended to the block containing it. This confirmation latency introduces an intricate interplay between miners and users. This paper provides approximate system equilibria and studies optimal designs of a blockchain.
-
Efficient Graph Bandit Learning with Side-Observations and Switching Constraints
Xueping Gong, Jiheng Zhang
AAAI, 39(16), 16871-16879, 2025
This paper presents a novel framework for multi-armed bandit problems with side-observations and switching constraints, which arises in a range of real-world applications. To address the challenges of effectively utilizing graph-structured observations while adhering to graph constraints, we design graph-agnostic and graph-aware algorithms tailored to this new setting.
-
Achieving Near-optimal Regrets in Confounded Contextual Bandits
Xueping Gong, Jiheng Zhang
AAMAS, 2643-2645, 2023
The contextual bandit problem is a theoretically justified framework with wide applications in various fields. While the previous study on this problem usually requires independence between noise and contexts, our work considers a more sensible setting where the noise becomes a latent confounder that affects both contexts and rewards. To deal with the challenges brought by the confounder, we apply the dual instrumental variable regression, which can correctly identify the true reward function.
-
Diffusion Approximations of Limited Processor Sharing Queues in Heavy Traffic
Jiheng Zhang, J. G. Dai, Bert Zwart
Annals of Applied Probability, 21(2), 745-799, 2011
We consider a processor sharing queue where the number of jobs served at any time is limited to $K$, with the excess jobs waiting in a buffer. We use random counting measures on the positive axis to model this system. The limit of this measure-valued process is obtained under diffusion scaling and heavy traffic conditions. As a consequence, the limit of the system size process is proved to be a piece-wise reflected Brownian motion.
-
Consensus Mechanism Design based on Structured Directed Acyclic Graphs
Jiahao He, Guangju Wang, Guangyuan Zhang, Jiheng Zhang
Blockchain: Research and Applications, 2(1), 2021
Capacity limit is a bottleneck for broader applications of blockchain systems. Scaling up capacity while preserving security and decentralization are major challenges in blockchain infrastructure design. In this paper, we design a proof of work-based mechanism by endowing directed acyclic graphs (DAG) with a novel structure so that peers can reach consensus at a large scale. At a high level, we break large blocks into smaller ones to improve utilization of broadcast network and embed a Nakamoto chain inside the DAG in a decent way to ensure security.
-
MADRNet: Morphology-Aware Dual-path Reversible Network for Sperm Classification
Fan Yang, Jingzhang Sun, Honglan Huang, Liang Zhang, Jiheng Zhang
IEEE Journal of Biomedical and Health Informatics, 2025
Sperm morphology analysis plays a crucial role in the clinical diagnosis of male infertility. However, manual evaluation is inherently subjective, and inconsistencies in diagnostic criteria may compromise accuracy. Some existing sperm image classification models are introduced but requiring manual intervention. Most models lack of consideration of alignment between computational classification and WHO sperm morphology standards. To address these challenges, we propose an innovative morphology-aware dual-path reversible network (MADRNet) in designing our model. We integrate key biomarkers, such as head aspect ratio and acrosomal integrity, both of which are crucial for clinical sperm assessment, into the network. Particularly, the network utilizes a dual-path attention mechanism, incorporating both parallel spatial and channel attention, while embedding the acrosome anatomical constraint within the channel attention. To further enhance the alignment of our model with the WHO standards, we develop a dynamic loss function considering head aspect ratio constraint. Further, we employ a reversible architecture to enable the model to preserve more microscopic details while reducing GPU memory consumption. Experiments on the HuSHeM dataset demonstrate that the model achieves an accuracy of 96.3% and an F1 score of 96.8%.
-
Data Sharing in the Metaverse with Key Abuse Resistance Based on Decentralized CP-ABE
Liang Zhang, Zhanrong Ou, Changhui Hu, Haibin Kan, Jiheng Zhang
IEEE Transactions on Computers, 74(3), 901-914, 2025
Data sharing is ubiquitous in the metaverse, which adopts blockchain as its foundation. Blockchain is employed because it enables data transparency, achieves tamper resistance, and supports smart contracts. However, securely sharing data based on blockchain necessitates further consideration. Ciphertext-policy attribute-based encryption (CP-ABE) is a promising primitive to provide confidentiality and fine-grained access control. Nonetheless, authority accountability and key abuse are critical issues that practical applications must address. Few studies have considered CP-ABE key confidentiality and authority accountability simultaneously. To our knowledge, we are the first to fill this gap by integrating non-interactive zero-knowledge (NIZK) proofs into CP-ABE keys and outsourcing the verification process to a smart contract. To meet the decentralization requirement, we incorporate a decentralized CP-ABE scheme into the proposed data sharing system. Additionally, we provide an implementation based on smart contract to determine whether an access control policy is satisfied by a set of CP-ABE keys. We also introduce an open incentive mechanism to encourage honest participation in data sharing. Hence, the key abuse issue is resolved through the NIZK proof and the incentive mechanism. We provide a theoretical analysis and conduct comprehensive experiments to demonstrate the feasibility and efficiency of the data sharing system.
-
Separation of timescales in a two-layered network
Maria Vlasiou, Jiheng Zhang, Bert Zwart, Rob van der Mei
International Teletraffic Congress, 185-192, 2012
We investigate a computer network consisting of two layers occurring in, for example, application servers. The first layer incorporates the arrival of jobs at a network of multi-server nodes, which we model as a many-server Jackson network. At the second layer, active servers at these nodes act now as customers who are served by a common CPU. Our main result shows a separation of time scales in heavy traffic.
-
Simulating Multi-Asset Classes Prices Using Wasserstein Generative Adversarial Network: A Study of Stocks, Futures and Cryptocurrency
Feng Han, Xiaojuan Ma, Jiheng Zhang
Journal of Risk and Financial Management, 15(1), 2022
Financial data are expensive and highly sensitive with limited access. We aim to generate abundant datasets given the original prices while preserving the original statistical features. We introduce the Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP) into the field of the stock market, futures market and cryptocurrency market.
-
Debiasing Recommendation by Learning Identifiable Latent Confounders
Qing Zhang, Xiaoying Zhang, Yang Liu, Hongning Wang, Min Gao, Jiheng Zhang, Ruocheng Guo
KDD, 3353-3363, 2023
Recommendation systems aim to predict users' feedback on items not exposed to them. Confounding bias arises due to the presence of unmeasured variables that can affect both a user's exposure and feedback. In this work, we propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables to resolve the non-identification issue. The proposed iDCF is a general deconfounded recommendation framework that applies proximal causal inference to infer the unmeasured confounders and identify the counterfactual feedback with theoretical guarantees.
-
Online Demand Fulfillment under Limited Flexibility
Zhen Xu, Hailun Zhang, Jiheng Zhang, Rachel Zhang
Management Science, 66(10), 4667-4685, 2020
We study online demand fulfillment in a class of networks with limited flexibility and arbitrary numbers of resources and request types. We show analytically that such a network is both necessary and sufficient to guarantee a performance gap independent of the market size compared with networks with full flexibility, extending the previous literature from the long chains to more general sparse networks.
-
Virtual Allocation Policies for Many-server Queues with Abandonment
Zhenghua Long, Jiheng Zhang
Mathematical Methods of Operations Research, 90, 399-451, 2019
We study a multiclass many-server queueing system with renewal arrivals and generally distributed service and patience times under a nonpreemptive allocation policy. The status of the system is described by a pair of measure-valued processes to track the residual service and patience times of customers in each class. We establish fluid approximations and study the long-term behavior of the fluid model. The equilibrium state of the fluid model leads to a nonlinear program, which enables us to identify a lower bound for the long-run expected total holding and abandonment costs and design an allocation policy to achieve this lower bound. The optimality of the proposed policy is also demonstrated via numerical experiments.
-
Generalized Linear Bandits with Local Differential Privacy
Yuxuan Han, Zhipeng Liang, Yang Wang, Jiheng Zhang
NeurIPS, 26511-26522, 2021
Contextual bandit algorithms are useful in personalized online decision-making. However, many applications such as personalized medicine and online advertising require the utilization of individual-specific information for effective learning, while user's data should remain private from the server due to privacy concerns. This motivates the introduction of local differential privacy (LDP) to contextual bandits. In this paper, we design LDP algorithms for stochastic generalized linear bandits to achieve the same regret bound as in non-privacy settings.
-
Convergence to Equilibrium States for Fluid Models of Many-server Queues with Abandonment
Zhenghua Long, Jiheng Zhang
Operations Research Letters, 42(6-7), 388-393, 2014
Fluid models have become an important tool for the study of many-server queues with general service and patience time distributions. The equilibrium state of a fluid model has been revealed by Whitt (2006) and shown to yield reasonable approximations to the steady state of the original stochastic systems. However, it remains an open question whether the solution to a fluid model converges to the equilibrium state and under what condition. We show in this paper that the convergence holds under a mild condition.
-
Heavy-traffic Approximations for a Layered Network with Limited Resources
Angelos Aveklouris, Maria Vlasiou, Jiheng Zhang, Bert Zwart
Probability and Mathematical Statistics, 37(2), 497-532, 2017
Motivated by a web-server model, we present a queueing network consisting of two layers. The first layer incorporates the arrival of customers at a network of two single-server nodes. We assume that the interarrival and the service times have general distributions. At the second layer, active servers act as jobs that are served by a single server working at speed one in a processor-sharing fashion. We further assume that the degree of resource sharing is limited by choice, leading to a limited processor-sharing discipline. Our main result is a diffusion approximation for the process describing the number of customers in the system.
-
A Note on Many-server Fluid Models with Time-varying Arrivals
Zhenghua Long, Jiheng Zhang
Probability in the Engineering and Informational Sciences, 33(3), 417-437, 2019
We extend the measure-valued fluid model, which tracks residuals of patience and service times, to allow for time-varying arrivals. The fluid model can be characterized by a one-dimensional convolution equation involving both the patience and service time distributions. We also make an interesting connection to the measure-valued fluid model tracking the elapsed waiting and service times.
-
Simple Policies with Provable Bounds for Managing Perishable Inventory
Hailun Zhang, Jiheng Zhang, Rachel Zhang
Production and Operations Management, 29(11), 2637-2650, 2020
The two fundamental decisions in the management of perishable inventory are how much new inventory to order and how much old inventory to clear before expiration. These decisions are known to be difficult due to the curse of dimensionality. We propose policies that are much simpler and easier to implement than existing ones in the literature. Our analysis revealed interesting insights into the circumstances under which perishability is negligible.
-
Approximations and Optimal Control for State-Dependent Limited Processor Sharing Queues
Varun Gupta, Jiheng Zhang
Stochastic Systems, 12(2), 205-225, 2022
We study approximations and control of a processor sharing (PS) server where the service rate depends on the number of jobs occupying the server. The control of such a system is implemented by imposing a limit on the number of jobs that can share the server concurrently, with the rest of the jobs waiting in a first-in-first-out (FIFO) buffer. A desirable control scheme should strike the right balance between efficiency (operating at a high service rate) and parallelism (preventing small jobs from getting stuck behind large ones). We use the framework of heavy-traffic diffusion analysis to devise near optimal control heuristics for such a queueing system.
-
The Impact of Slow Steaming of Ocean Container Transport on Global Supply Chain
Chung-Yee Lee, Hau Lee, Jiheng Zhang
Transportation Research Part E: Logistics and Transportation Review, 76, 176-190, 2015
With a dominant volume of global transportation being conducted by sea, ocean container transport greatly impacts the global economy. Since sea vessels are drastically more fuel efficient when traveling at lower speeds, slow steaming has become a widely adopted practice to reduce bunker costs. However, this leads to a longer transportation time, which together with the unpredictability of the delay has been a big challenge. We propose a model to quantify the relationship among shipping time, bunker cost and delivery reliability. Our findings lead to a simple and implementable policy with a controlled cost and guaranteed delivery reliability.
-
Unveiling Discrete Clues: Superior Healthcare Predictions for Rare Diseases
Chuang Zhao, Hui Tang, Jiheng Zhang, Xiaomeng Li
WWW, 1747-1758, 2025
Accurate healthcare prediction is essential for improving patient outcomes. Existing work primarily leverages advanced frameworks like attention or graph networks to capture the intricate collaborative signals in electronic health records. However, prediction for rare diseases remains challenging due to limited co-occurrence and inadequately tailored approaches. To address this issue, this paper proposes UDC, a novel method that unveils discrete clues to bridge consistent textual knowledge and collaborative signals within a unified semantic space, thereby enriching the representation semantics of rare diseases.
-
Scaling and Modeling of Call Center Arrivals
Xiaowei Zhang, Jeff Hong, Jiheng Zhang
Winter Simulation Conference, 476-485, 2014
Even when they are known to be continuous, Poisson-process rate functions are sometimes specified as piecewise constant. To better approximate the unknown continuous rate function, we fit a piecewise-quadratic function. In addition to maintaining the rate's integral over each time interval, at each interval's end point we match the rates and their first derivatives.