Emergent Mind

Abstract

Deciphering oracle bone script plays a significant role in Chinese archaeology and philology. However, the task is challenging because oracle character images are scarce. To address this scarcity, we propose Diff-Oracle, a method based on diffusion models (DMs) that generates controllable oracle characters in sufficient quantity. In contrast to most DMs, which rely on text prompts, we incorporate a style encoder to control style information during generation. This encoder extracts style prompts from existing oracle character images: style features obtained from a CLIP model are converted into a text-embedding format. Inspired by ControlNet, we introduce a content encoder that captures the desired content information from content images, ensuring the fidelity of character glyphs. To train Diff-Oracle effectively, we obtain pixel-level paired oracle character images (i.e., style and content images) using a pre-trained image-to-image translation model. Extensive qualitative and quantitative experiments on two benchmark datasets, Oracle-241 and OBC306, demonstrate that Diff-Oracle outperforms existing generative methods in image generation, which in turn improves recognition accuracy. Source code will be made available.
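The two conditioning paths described in the abstract can be illustrated with a toy data-flow sketch. It is not the authors' implementation: it uses plain Python lists in place of a CLIP image encoder and a diffusion U-Net, and every name and size in it (`StyleEncoder`, `ContentBranch`, `TEXT_DIM`, `NUM_TOKENS`, `IMG_DIM`) is hypothetical. The zero-initialised output projection is the technique ControlNet uses for its side branch; whether Diff-Oracle keeps it is an assumption here.

```python
import random

# All sizes below are hypothetical and chosen only to keep the example small.
TEXT_DIM = 8      # width of the text-embedding space the denoiser expects
NUM_TOKENS = 4    # number of style-prompt tokens produced per style image
IMG_DIM = 6       # width of the image feature (stand-in for a CLIP image feature)


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def zero_matrix(rows, cols):
    return [[0.0] * cols for _ in range(rows)]


class StyleEncoder:
    """Projects one image feature into NUM_TOKENS vectors of width TEXT_DIM,
    so the result can stand in for a text-prompt embedding (assumed design)."""

    def __init__(self, rng):
        self.projections = [
            random_matrix(TEXT_DIM, IMG_DIM, rng) for _ in range(NUM_TOKENS)
        ]

    def __call__(self, image_feature):
        return [matvec(p, image_feature) for p in self.projections]


class ContentBranch:
    """ControlNet-style side branch: its output projection starts at zero,
    so at initialisation it adds nothing to the denoiser features."""

    def __init__(self, feature_dim, rng):
        self.hidden = random_matrix(feature_dim, feature_dim, rng)
        self.zero_out = zero_matrix(feature_dim, feature_dim)

    def __call__(self, content_feature):
        return matvec(self.zero_out, matvec(self.hidden, content_feature))


def denoiser_features(base_features, content_residual):
    """Add the content residual to the base denoiser features."""
    return [b + r for b, r in zip(base_features, content_residual)]


rng = random.Random(0)
style_encoder = StyleEncoder(rng)
content_branch = ContentBranch(TEXT_DIM, rng)

# Random stand-ins for a style-image feature, a content-image feature,
# and one layer of base denoiser features.
style_image_feature = [rng.uniform(-1.0, 1.0) for _ in range(IMG_DIM)]
content_image_feature = [rng.uniform(-1.0, 1.0) for _ in range(TEXT_DIM)]
base = [rng.uniform(-1.0, 1.0) for _ in range(TEXT_DIM)]

style_prompt = style_encoder(style_image_feature)
conditioned = denoiser_features(base, content_branch(content_image_feature))
```

In this sketch `style_prompt` has the shape of a short sequence of text-embedding tokens, which is what lets an image-derived style signal replace a text prompt, and `conditioned` equals `base` until the zero-initialised projection is trained, so the content branch cannot disturb the base model at the start of training.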
