
Warehousing Web Data (0705.1456v1)

Published 10 May 2007 in cs.DB

Abstract: In a data warehousing process, mastering the data preparation phase allows substantial gains in terms of time and performance when performing multidimensional analysis or using data mining algorithms. Furthermore, a data warehouse can require external data. The web is a prevalent data source in this context. In this paper, we propose a modeling process for integrating diverse and heterogeneous (so-called multiform) data into a unified format. Moreover, the very schema definition provides first-rate metadata in our data warehousing context. At the conceptual level, a complex object is represented in UML. Our logical model is an XML schema that can be described with a DTD or the XML-Schema language. Finally, we have designed a Java prototype that transforms our multiform input data into XML documents representing our physical model. The XML documents we obtain are then mapped into a relational database we view as an ODS (Operational Data Storage), whose content will have to be re-modeled in a multidimensional way to allow its storage in a star schema-based warehouse and, later, its analysis.
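
To make the transformation step in the abstract more concrete, here is a minimal Java sketch of serializing one heterogeneous input item as an XML document. It is not the authors' prototype: the class name `MultiformToXml`, the element names `complexObject` and `attribute`, and the `name`/`kind` attributes are all hypothetical, chosen only for illustration, and the paper's actual schema may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: class, element, and attribute names are assumptions,
// not the schema defined in the paper.
class MultiformToXml {

    // Escape the five predefined XML special characters so that arbitrary
    // text is safe in both element content and attribute values.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                case '"': out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default: out.append(c);
            }
        }
        return out.toString();
    }

    // Serialize one input item (its name, an assumed "kind" such as text or
    // image, and its descriptive attributes) as a small XML document.
    static String toXml(String name, String kind, Map<String, String> attributes) {
        StringBuilder xml = new StringBuilder();
        xml.append("<complexObject name=\"").append(escape(name))
           .append("\" kind=\"").append(escape(kind)).append("\">\n");
        for (Map.Entry<String, String> e : attributes.entrySet()) {
            xml.append("  <attribute name=\"").append(escape(e.getKey())).append("\">")
               .append(escape(e.getValue())).append("</attribute>\n");
        }
        xml.append("</complexObject>\n");
        return xml.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, so the output is deterministic.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("source", "http://example.org/page.html");
        attrs.put("title", "Sales & Marketing");
        System.out.print(toXml("doc1", "text", attrs));
    }
}
```

In a pipeline like the one the abstract describes, documents of this shape would then be mapped into relational tables; that mapping is not sketched here because the abstract does not describe it in detail.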

Citations (7)
