
Automated news scraping: how a legal company is keeping up with M&A news



Mergers and acquisitions (M&A) move quickly, and the legal departments and firms advising on such deals need up-to-date insights. Manually tracking M&A news is inefficient, given the volume of daily updates. At Automations Lab, we helped a legal company streamline this process using Apify, ChatGPT, and Make.com, creating an automated workflow that scrapes news websites, filters relevant articles, and extracts key deal details.


Let’s break down how this system works and how it ensures accurate, real-time M&A intelligence.


The Make.com Scenario: A Fully Automated M&A News Scraping Pipeline


Every day at 9 AM, the automation triggers, executing a structured workflow to scrape, filter, and process M&A news. Here’s how it works:


1. Tracking News Sources in Google Sheets


A Google Sheet serves as the control panel. It contains a list of URLs from trusted news websites, ensuring the system focuses on relevant sources rather than scraping the entire web.


Why Google Sheets? It provides flexibility—new sources can be added anytime without modifying the automation.
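To illustrate the control-panel idea outside Make.com, here is a minimal Python sketch that reads a source list from a CSV export of the sheet. The column name `url`, the helper `load_source_urls`, and the example addresses are assumptions made for illustration; the actual sheet layout is not described in this article.

```python
import csv
import io
from urllib.parse import urlparse


def load_source_urls(csv_text: str, column: str = "url") -> list[str]:
    """Return unique, well-formed http(s) URLs from a CSV export of the sheet."""
    seen: set[str] = set()
    urls: list[str] = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        url = (row.get(column) or "").strip()
        parsed = urlparse(url)
        # Skip blank cells and entries that are not http(s) URLs.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            continue
        # Skip rows that repeat a URL already collected.
        if url in seen:
            continue
        seen.add(url)
        urls.append(url)
    return urls


# Hypothetical sheet contents, including a duplicate and a typo.
sample_sheet = """url,notes
https://example.com/business/deals,main deals feed
https://example.com/business/deals,duplicate row
not-a-url,typo in the sheet
https://news.example.org/m-and-a,
"""
sources = load_source_urls(sample_sheet)
```

Validating and de-duplicating at this point keeps a mistyped row in the sheet from triggering a wasted crawl later in the workflow.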


2. Scraping Articles with Apify


Once the URLs are listed, the scenario triggers an Apify Actor that scrapes each URL one by one, extracting the full article content.


Apify operates on a pay-as-you-go model, making it cost-effective for targeted scraping. A single crawl costs ~$0.10 to ~$0.25, making it affordable even for daily operations.


3. Extracting Metadata with an HTTP Module


After scraping, an HTTP request module scans each article’s HTML structure to extract metadata like:

  • Title

  • Author (if available)

  • Source website

  • Publish date


This helps categorize and validate the content before deeper analysis. The publish date is the critical field: it allows the scenario to filter out old articles and focus only on fresh M&A news. Many websites don’t provide structured metadata for dates, so the date often has to be located in the page’s HTML.
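For pages that do expose metadata in their HTML head, the extraction can be sketched in Python with the standard library. This is an illustration, not the Make.com module itself: the choice of the `author`, `og:site_name`, and `article:published_time` meta keys is an assumption based on common conventions, and the sample page is made up. Pages without such tags would need site-specific handling.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Collect the <title> text and selected <meta> tags from an HTML page."""

    # Assumed mapping from common meta keys to the fields listed above.
    META_KEYS = {
        "author": "author",
        "og:site_name": "source",
        "article:published_time": "published",
    }

    def __init__(self) -> None:
        super().__init__()
        self.metadata: dict[str, str] = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            key = attr.get("name") or attr.get("property")
            field = self.META_KEYS.get(key)
            if field and attr.get("content"):
                # Keep the first value seen for each field.
                self.metadata.setdefault(field, attr["content"].strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.metadata.setdefault("title", data.strip())


def extract_metadata(html: str) -> dict[str, str]:
    """Return whichever of title, author, source, published the page exposes."""
    parser = MetadataParser()
    parser.feed(html)
    return parser.metadata


# Made-up page used only to exercise the parser.
sample_html = """
<html><head>
<title>Acme Corp to acquire Widget Ltd</title>
<meta name="author" content="Jane Doe">
<meta property="og:site_name" content="Example Business News">
<meta property="article:published_time" content="2025-03-10T07:30:00+00:00">
</head><body><p>Article text</p></body></html>
"""
metadata = extract_metadata(sample_html)
```

Fields that a page does not provide are simply absent from the result, which matches the "if available" caveat on the author field.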


4. Filtering for the Last 24 Hours


Since Apify does not inherently filter articles by date, the scenario applies an additional Make.com filter to process only articles published in the last 24 hours. This ensures that the legal team receives only the most recent and relevant updates.
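The same 24-hour rule can be expressed in a few lines of Python. This is a sketch rather than the Make.com filter itself: it assumes publish dates arrive as ISO 8601 strings, that timestamps without an offset are in UTC, and that the 9 AM run is also in UTC. None of these details are stated in the scenario.

```python
from datetime import datetime, timedelta, timezone


def is_recent(published_iso: str, now: datetime, window_hours: int = 24) -> bool:
    """True if the ISO 8601 timestamp falls within the last `window_hours`."""
    published = datetime.fromisoformat(published_iso)
    if published.tzinfo is None:
        # Assumption: timestamps without an offset are treated as UTC.
        published = published.replace(tzinfo=timezone.utc)
    age = now - published
    # Reject future-dated articles as well as ones older than the window.
    return timedelta(0) <= age <= timedelta(hours=window_hours)


# Hypothetical run time and articles.
run_time = datetime(2025, 3, 10, 9, 0, tzinfo=timezone.utc)
articles = [
    {"title": "Fresh deal", "published": "2025-03-10T07:30:00+00:00"},
    {"title": "Old deal", "published": "2025-03-08T12:00:00+00:00"},
]
fresh = [a for a in articles if is_recent(a["published"], run_time)]
```

Passing `now` in explicitly, instead of reading the clock inside the function, makes the filter easy to test against a fixed run time.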



5. ChatGPT Classifies Articles: M&A or Not?


At this stage, ChatGPT steps in to analyze the article’s content and classify whether it’s actually about M&A transactions.


Many general business news articles mention company names without discussing acquisitions. ChatGPT scans the text and returns a YES/NO decision on whether the article is M&A-related. This prevents unnecessary processing and keeps the dataset focused.
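A rough Python sketch of this gate is shown below. The prompt wording, the function names, and the `fake_model` stand-in are all assumptions for illustration; the prompt actually used in the scenario is not reproduced in this article. The model call is passed in as a plain function so the sketch runs without any API access.

```python
from typing import Callable

# Hypothetical prompt; the production wording is not given in the article.
CLASSIFY_PROMPT = (
    "You are screening news for an M&A legal team. "
    "Answer with a single word, YES or NO: does the following article "
    "report a merger or acquisition transaction?\n\n{article}"
)


def parse_yes_no(reply: str) -> bool:
    """Map a model reply to a boolean; anything but a clear YES is False."""
    words = reply.strip().upper().split()
    return bool(words) and words[0].strip(".,!:\"'") == "YES"


def is_mna_article(article_text: str, ask_model: Callable[[str], str]) -> bool:
    """Classify one article using any function that sends a prompt to a model."""
    prompt = CLASSIFY_PROMPT.format(article=article_text)
    return parse_yes_no(ask_model(prompt))


def fake_model(prompt: str) -> str:
    # Stand-in for the real ChatGPT call so the sketch runs offline.
    article = prompt.split("\n\n", 1)[1]
    return "YES." if "acquire" in article.lower() else "No"


relevant = is_mna_article("Acme Corp agreed to acquire Widget Ltd.", fake_model)
irrelevant = is_mna_article("Acme Corp opened a new office.", fake_model)
```

Treating every ambiguous reply as NO is a design choice: it favors a clean dataset over recall. A team that would rather review borderline articles by hand could route unclear replies to a separate list instead.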


6. Extracting Key Deal Information with ChatGPT


If an article is about M&A, ChatGPT performs a deeper analysis and extracts:

  • Buyer’s name

  • Target company (the company being sold)

  • Transaction date

  • Deal value (if available)

  • Industry sector

  • Short summary of the transaction


This step is customizable, allowing the legal team to define what information is most valuable for their work.
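One way to keep the extracted fields consistent is to ask the model for JSON and validate the reply before using it. The sketch below assumes exactly that; the `Deal` field names and the sample reply are hypothetical, and the article does not say which output format the scenario requests from ChatGPT.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Deal:
    """Fields listed above; each stays None when the article does not state it."""
    buyer: Optional[str] = None
    target: Optional[str] = None
    transaction_date: Optional[str] = None
    deal_value: Optional[str] = None
    sector: Optional[str] = None
    summary: Optional[str] = None


def parse_deal(reply: str) -> Deal:
    """Parse a JSON object reply, ignoring unknown keys and blank values."""
    data = json.loads(reply)
    cleaned = {}
    for field in fields(Deal):
        value = data.get(field.name)
        if isinstance(value, str) and value.strip():
            cleaned[field.name] = value.strip()
    return Deal(**cleaned)


# Made-up model reply with one blank field and one unexpected key.
sample_reply = """{
  "buyer": "Acme Corp",
  "target": "Widget Ltd",
  "transaction_date": "2025-03-10",
  "deal_value": "",
  "sector": "Manufacturing",
  "summary": "Acme Corp agreed to acquire Widget Ltd.",
  "confidence": "high"
}"""
deal = parse_deal(sample_reply)
```

Adding or removing a field then means changing the dataclass and the prompt together, which keeps the customization described above in one place.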


7. Delivering the Insights via Email or Slack


Once all data is processed, Make.com compiles the extracted information into a structured message and sends it to a pre-defined recipient list via Email (daily report format) and Slack (for instant updates).


This ensures that key decision-makers stay informed without manually searching for news.
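The message assembly can be sketched as a small formatting function. The layout, the field names, and the sample deal are illustrative assumptions; the real report template lives in the Make.com scenario and is not shown here.

```python
def format_digest(deals: list[dict[str, str]], report_date: str) -> str:
    """Build a plain-text digest usable as an email body or a Slack message."""
    if not deals:
        return f"M&A digest for {report_date}: no new deals found."
    lines = [f"M&A digest for {report_date}: {len(deals)} new deal(s)"]
    for number, deal in enumerate(deals, start=1):
        # Deal value is optional, so fall back to a neutral label.
        value = deal.get("deal_value") or "undisclosed"
        lines.append(f"{number}. {deal['buyer']} -> {deal['target']} ({value})")
        lines.append(f"   {deal['summary']}")
        lines.append(f"   Source: {deal['url']}")
    return "\n".join(lines)


# Hypothetical deal record for illustration.
digest = format_digest(
    [
        {
            "buyer": "Acme Corp",
            "target": "Widget Ltd",
            "deal_value": "",
            "summary": "Acme Corp agreed to acquire Widget Ltd.",
            "url": "https://example.com/business/deals/acme-widget",
        }
    ],
    "2025-03-10",
)
```

Sending an explicit "no new deals" message on quiet days lets recipients distinguish an empty news day from a failed run.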


8. Storing Data in Google Sheets for Easy Access


Finally, all extracted information—including links, text, and structured M&A details—is stored in Google Sheets. This allows the legal team to build a historical database of M&A news. Data can be searched, filtered, and analyzed anytime.
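For the storage step, the main design question is how each article becomes a row. A minimal Python sketch follows, assuming a fixed column order and de-duplication by URL; the column names are hypothetical and the article does not describe the real sheet's layout.

```python
# Assumed column order for the archive sheet.
COLUMNS = [
    "url", "title", "published", "buyer",
    "target", "deal_value", "sector", "summary",
]


def to_row(record: dict[str, str]) -> list[str]:
    """Flatten one processed article into the fixed column order."""
    # Missing or empty fields become empty cells.
    return [record.get(column) or "" for column in COLUMNS]


def new_rows(records: list[dict[str, str]], stored_urls: set[str]) -> list[list[str]]:
    """Rows for articles not yet stored; adds their URLs to `stored_urls`."""
    rows = []
    for record in records:
        url = record.get("url", "")
        if url and url not in stored_urls:
            stored_urls.add(url)
            rows.append(to_row(record))
    return rows


# Hypothetical records: the second URL is already in the sheet.
records = [
    {"url": "https://example.com/a", "title": "A", "buyer": "Acme Corp"},
    {"url": "https://example.com/b", "title": "B"},
]
stored = {"https://example.com/b"}
rows = new_rows(records, stored)
```

Keying on the URL prevents the same article from being appended twice if a source keeps it on its front page for several days.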


Make.com and Apify for automated news scraping

Cost & Efficiency: A Highly Scalable Solution


One of the biggest advantages of this automation is its low cost compared to manual research. Here’s a rough estimate of daily expenses:

  • Apify: ~$0.10 to ~$0.25 per crawl (free tier covers up to $5/month, then $39/month).

  • ChatGPT: Costs range from $0.10 to $1 per day, depending on usage.

  • Make.com: Pricing varies but remains significantly lower than hiring a manual research team.
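As a back-of-the-envelope check on these figures, the daily ranges can be turned into a monthly estimate. The sketch below assumes one crawl per day and a 30-day month, works in cents to avoid floating-point rounding, and leaves out the Make.com subscription as well as Apify's free tier and plan pricing, so it is a rough usage figure rather than an invoice.

```python
def monthly_range_cents(per_day_low: int, per_day_high: int, days: int = 30) -> tuple[int, int]:
    """Scale a daily cost range (in cents) to a monthly range."""
    return per_day_low * days, per_day_high * days


APIFY_PER_CRAWL = (10, 25)    # ~$0.10 to ~$0.25 per crawl, in cents
CHATGPT_PER_DAY = (10, 100)   # $0.10 to $1 per day, in cents
CRAWLS_PER_DAY = 1            # assumption: one crawl per daily run

low, high = monthly_range_cents(
    APIFY_PER_CRAWL[0] * CRAWLS_PER_DAY + CHATGPT_PER_DAY[0],
    APIFY_PER_CRAWL[1] * CRAWLS_PER_DAY + CHATGPT_PER_DAY[1],
)
summary = f"${low / 100:.2f} to ${high / 100:.2f} per month"
# summary is "$6.00 to $37.50 per month"
```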


This setup scales effortlessly—new sources can be added, and filters can be refined without additional development costs.


The same approach can also be applied to other companies, departments, and types of news.


Why This Matters for Legal Firms


M&A lawyers rely on speed and accuracy. Missing a major deal announcement or reacting late to an acquisition can impact client relationships and strategic advice. This automation:

  • Saves hours of manual research every day.

  • Filters out irrelevant news, reducing information overload.

  • Delivers structured insights, making it easy to act on new developments.


Would this system improve your legal research process? If you deal with M&A analysis, compliance, or corporate law, this approach can give you a competitive edge in tracking market movements.



Frequently Asked Questions (FAQ)


1. What are the main benefits of automating news tracking?


Automating news tracking saves time, ensures real-time updates, and reduces the risk of missing critical news. Instead of manually searching for relevant articles, legal teams receive structured insights directly via email or Slack. This improves decision-making speed and allows firms to focus on strategic analysis rather than data collection.


2. How does ChatGPT determine if an article is related to a specific topic?


ChatGPT analyzes the full text of each scraped article and identifies key terms, phrases, and contextual clues that indicate whether it discusses a specific topic. It then returns a simple YES/NO classification. This step helps filter out irrelevant business news articles that may mention company names but are not related to actual transactions.


3. Can I customize the extracted information based on my firm's needs?


Yes, the system is fully customizable. You can define which details matter most—such as buyer/seller names, transaction value, industry sector, or specific keywords—and adjust ChatGPT’s extraction prompts accordingly. The workflow can also be modified to track additional sources or deliver reports in different formats (e.g., CSV exports).


4. What are the costs associated with this automation?


The cost depends on usage but remains significantly lower than manual research efforts:

- Apify: ~$0.10 to ~$0.25 per crawl (with free tier options).

- ChatGPT API: Costs vary from $0.10 to $1 per day based on volume.

- Make.com: Pricing depends on scenario complexity but is scalable as needed.


Overall, this setup provides a cost-effective alternative to hiring analysts for daily news monitoring.


5. How can I implement a similar automation for my firm?


If you want to automate your news tracking or any other research process, Automations Lab can help design a custom workflow tailored to your needs using tools like Make.com, Apify, and ChatGPT. Contact us today to discuss your requirements and start optimizing your firm's efficiency!


