Utility-Based Preference Training for Effective Synthetic Text Classification

High-quality synthetic text can mitigate annotation scarcity in text classification. However, standard preference optimization often produces samples that are fluent but weakly label-specific. We present Utility-weighted Direct Preference Optimization (U-DPO), a preference-optimization framework for class-conditional synthetic data generation. In U-DPO, a task-specific classifier provides a margin-based external score for each candidate generation, which is combined with an embedding-based internal similarity score to form an overall utility. These utilities are used (i) to mine preference pairs from multiple candidates per class and (ii) to weight each DPO update by the utility gap between preferred and dispreferred samples. This design encourages the generator to concentrate on informative, label-discriminative preference comparisons rather than treating all pairs equally. Across two multiclass scientific-abstract benchmarks (arXiv and WOS-11967), U-DPO consistently improves downstream SciBERT classification accuracy compared with both vanilla synthetic generation and standard DPO fine-tuning, with gains of up to 0.88 percentage points on arXiv and 0.83 percentage points on WOS-11967, depending on the generator. An additional GPT-4.5-based evaluation also indicates a higher mean quality score for U-DPO samples, with reduced variance.
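
The abstract gives no formulas, so the following is only a minimal sketch of how the three steps it describes (utility scoring, pair mining, gap-weighted DPO updates) might fit together. Everything specific here is an assumption for illustration and not the authors' method: the convex mixing weight `alpha`, the use of cosine similarity to a class centroid as the "internal" score, the `min_gap` threshold for mining pairs, and the linear scaling of the DPO loss by the utility gap. All function names are hypothetical.

```python
import math
from itertools import combinations


def margin_score(probs, label):
    """External score (assumed form): target-class probability minus the best rival."""
    rival = max(p for i, p in enumerate(probs) if i != label)
    return probs[label] - rival


def cosine(u, v):
    """Cosine similarity between two plain-Python vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def utility(probs, label, emb, class_centroid, alpha=0.5):
    """Overall utility as an assumed convex mix of external and internal scores."""
    external = margin_score(probs, label)
    internal = cosine(emb, class_centroid)
    return alpha * external + (1 - alpha) * internal


def mine_pairs(utilities, min_gap=0.1):
    """Pair up candidates of one class; keep pairs whose utility gap is large enough.

    Returns (preferred_index, dispreferred_index, gap) tuples.
    """
    pairs = []
    for i, j in combinations(range(len(utilities)), 2):
        gap = utilities[i] - utilities[j]
        if abs(gap) >= min_gap:
            pairs.append((i, j, gap) if gap > 0 else (j, i, -gap))
    return pairs


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def weighted_dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, gap, beta=0.1):
    """Standard per-pair DPO loss, scaled by the utility gap (assumed linear weight)."""
    logits = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    return gap * -log_sigmoid(logits)


# Toy usage: three candidates for one class, with made-up utilities.
pairs = mine_pairs([0.9, 0.2, 0.85], min_gap=0.1)
for win, lose, gap in pairs:
    loss = weighted_dpo_loss(-1.0, -1.5, -1.2, -1.4, gap)
    print(win, lose, round(gap, 2), round(loss, 4))
```

Under these assumptions, pairs with a small utility gap are either dropped during mining or contribute a proportionally smaller loss, which is one way to realise the stated goal of not treating all pairs equally.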