Annotating Text from News Articles to Enhance the Performance of an AI Model

Text Annotation - Case Study

The Company: German Construction Technology Company

Industry: Real Estate

Company Headquarters: Germany

The Objective:

Our client operates a platform that publishes project data across the USA and Europe for customers ranging from small businesses to Fortune 500 companies in real estate and manufacturing. They wanted to gather real-time, multilingual project data from varied online sources.

While AI handled most of the data, the 20% that was too complex to classify automatically needed manual verification and annotation for complete accuracy. The client partnered with HabileData to verify auto-classified text, append missing details, and manually annotate unclassified content.

The Challenges:
  • Navigating Complex Data Landscapes - Extracting and accurately comprehending contextual information from construction articles required precise tagging and labeling of attributes such as project size, phase, location, owner, architects, and start and end dates.
  • Managing Data Surge - Processing voluminous data, with hundreds of construction articles pouring in daily, required swift and efficient handling. Ensuring that each article was processed within a 24-hour timeframe while maintaining stringent quality standards was a formidable task.
  • Cultivating Expertise - Training a specialized team to interpret complex architectural data and validate auto-classified information required significant investment in education and hands-on training.
HabileData’s Solution

HabileData classified and labeled over 10,000 construction-related articles, applying rigorous validation, verification, and data append processes to improve the accuracy and efficiency of the client's AI algorithm.

Input Data - Access to construction-related news articles from multi-lingual and multi-format sources.

The task was divided into two parts:

  • Accuracy check of auto-tagged data through verification, validation, and data append.
  • Manual annotation of the 20% of data that couldn’t be auto-classified.
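The two-part split above can be pictured as a simple routing step. The sketch below is illustrative only: the case study does not say how the client's AI decided that an article could not be auto-classified, so the confidence score, the 0.8 threshold, and the field names are all assumptions.

```python
# Hypothetical routing of articles into the two workstreams.
# ASSUMPTION: each article carries an auto-assigned label and a confidence
# score; neither field nor the threshold comes from the case study.

def route_articles(articles, threshold=0.8):
    """Split articles into (verify_queue, manual_queue)."""
    verify_queue, manual_queue = [], []
    for article in articles:
        auto_classified = (
            article["label"] is not None and article["confidence"] >= threshold
        )
        if auto_classified:
            verify_queue.append(article)   # part 1: accuracy check of auto-tags
        else:
            manual_queue.append(article)   # part 2: manual annotation
    return verify_queue, manual_queue


# Made-up sample data for illustration.
sample = [
    {"id": "a1", "label": "residential", "confidence": 0.95},
    {"id": "a2", "label": "commercial", "confidence": 0.91},
    {"id": "a3", "label": "infrastructure", "confidence": 0.88},
    {"id": "a4", "label": "commercial", "confidence": 0.40},
    {"id": "a5", "label": None, "confidence": 0.0},
]
verify_queue, manual_queue = route_articles(sample)
print(len(verify_queue), "to verify,", len(manual_queue), "to annotate manually")
```

Any real routing rule would depend on how the client's classifier reports uncertainty.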

Accuracy Verification Process

  • The data was cleaned and formatted for consistency.
  • Through a robust validation and verification process, we ensured that the auto-tagged data was accurate and reliable.
  • Statistical measures like precision, recall, and F1 score were used to evaluate the accuracy of auto-tagged data.
  • We meticulously appended any missing details to the data, enhancing its completeness and accuracy.
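As a rough illustration of the statistical measures mentioned above, the following sketch computes per-label precision, recall, and F1 by comparing auto-assigned tags with human-verified tags. The label names and sample data are made up, and the function name `per_label_scores` is hypothetical; the case study does not describe the actual evaluation tooling.

```python
from collections import Counter


def per_label_scores(auto_tags, verified_tags):
    """Return {label: (precision, recall, f1)} for single-label tagging."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for predicted, gold in zip(auto_tags, verified_tags):
        if predicted == gold:
            tp[gold] += 1
        else:
            fp[predicted] += 1  # auto-tagger assigned this label wrongly
            fn[gold] += 1       # auto-tagger missed the correct label
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Illustrative project-phase tags (not real project data).
auto_tags = ["planning", "construction", "construction", "completed", "planning"]
verified_tags = ["planning", "construction", "planning", "completed", "planning"]
scores = per_label_scores(auto_tags, verified_tags)
for label, (p, r, f1) in sorted(scores.items()):
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Reporting the scores per label rather than as one overall number makes it easier to see which fields the auto-tagger handles poorly.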

Manual Text Annotation

  • Developed clear and concise guidelines for annotators to handle the 20% of data that couldn't be auto-classified.
  • Established KPIs, SOPs, and metrics based on a thorough project assessment and a deep understanding of the client's needs.
  • A pilot phase was conducted to fine-tune the approach and identify potential bottlenecks before full-scale implementation.
  • Human annotators categorized text based on predefined criteria.
  • Texts, PDFs, and images were labeled by assigning metadata or tags.
    • Text Data - Reviewed and labeled text passages, marking entities and sentiments.
    • PDF Data - Extracted and tagged text and structured data like tables and charts.
    • Image Data - Labeled objects and features in images for tasks such as object detection and classification.
  • Quality Control - We implemented a two-step quality check process for annotated articles, involving review and validation by experienced annotators and quality specialists to ensure precision and reliability.
  • Deliverables - Classified and labeled over 10,000 construction-related articles.
  • The annotated dataset was exported in a machine learning-compatible format for training and validation.
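To make the idea of a machine learning-compatible export concrete, here is a minimal sketch of one annotated article serialized as JSON Lines. The export format is not named in the case study, so JSON Lines is an assumption, as are the `ArticleAnnotation` class, its field names, and the sample values. The fields mirror the attributes listed under the challenges (project size, phase, location, owner, architects, and start and end dates).

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ArticleAnnotation:
    """Hypothetical record for one annotated construction article."""
    article_id: str
    language: str
    project_size: Optional[str] = None
    phase: Optional[str] = None
    location: Optional[str] = None
    owner: Optional[str] = None
    architects: List[str] = field(default_factory=list)
    start_date: Optional[str] = None  # ISO 8601 date string, if known
    end_date: Optional[str] = None


def to_jsonl(records):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(asdict(r), ensure_ascii=False) for r in records)


# Made-up example values for illustration only.
records = [
    ArticleAnnotation(
        article_id="example-001",
        language="de",
        project_size="12,000 m²",
        phase="planning",
        location="München",
        owner="Example Owner GmbH",
        architects=["Example Architekten"],
        start_date="2024-03-01",
    ),
    ArticleAnnotation(article_id="example-002", language="en", phase="construction"),
]
jsonl_text = to_jsonl(records)
print(jsonl_text)
```

Fields left as `None` show where details were missing from the article, which is the gap the data append step is meant to fill.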

Post Project Support

  • Provided ongoing monitoring and maintenance, including periodic quality assessments and updates, to sustain AI model accuracy and adapt to industry trends.
  • Offered training refreshers and support to ensure the client's team remained proficient, fostering a long-term partnership for lasting results.
Business Impact
  • Reduced data processing time from days to just a few hours.
  • Increased algorithmic accuracy resulted in higher customer acquisition.
  • The offshoring model saved 50% on project costs for the client.

Value Addition

Automated classification and validation, combined with accurate annotation of text from thousands of news articles, significantly scaled the AI model for the German construction technology company.

