Data Scraping and System Integration

The prevalence of data scraping and system integration projects shows that we are all learning the value of data in our solutions. However, there seems to be a disconnect between this common pursuit and the paths and challenges it involves. Data scraping and system integration are often treated as very different ideas, but scraping is really one form of integration: we scrape data from a site or application in order to integrate with it. That is typically a one-way integration, but it can go both ways.

Methods Of Data Scraping And System Integration

This topic is best started with a list of the ways to move data from one application to another. Data scraping is one of them, and it is likely the most complicated and fragile approach. Therefore, it is essential to consider the other options before tackling that “worst-case” option. Integration can be achieved through the following techniques. The list is not comprehensive, but it is a good overview. For example, we will skip the manual option, which is always available.

  • Web Scraping
  • Direct Link/SSO
  • Application Programming Interface (API)
  • Data feeds (RSS or other)
  • FTP/SFTP and data import/export

Each option has pros and cons worth examining before you start your next integration project.

Web Scraping

This approach is as close to manual integration as you can get. The application runs a process that connects to the desired system exactly as a user would. For example, the application will open a browser window, enter a URL, send login information, click on buttons, and then copy data from the screens into its own data stores. You can mimic a data scraping process by logging into your chosen site, navigating to a desired page, and then writing down the values you see on the page. This is a brute-force approach that requires nothing from the integrated system. It is also fragile, because a change to the page layout can break the process. In addition, some sites and applications restrict usage and timing. That can frustrate scraping attempts, and sometimes that is intentional.
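
The "copy values from the screen" step can be sketched in a few lines of Python. This is a minimal illustration, not a full scraper: the page markup, the `orders` table, and the `total` class name are all hypothetical, and a real process would first fetch the page (and usually log in) before parsing it.

```python
from html.parser import HTMLParser

# Hypothetical page markup; a real scraper would fetch this from the target site.
SAMPLE_PAGE = """
<html><body>
  <table id="orders">
    <tr><td class="order-id">1001</td><td class="total">25.00</td></tr>
    <tr><td class="order-id">1002</td><td class="total">40.50</td></tr>
  </table>
</body></html>
"""


class CellScraper(HTMLParser):
    """Collects the text of every <td> whose class matches target_class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.values = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        # Start capturing when we enter a matching table cell.
        if tag == "td" and dict(attrs).get("class") == self.target_class:
            self._capturing = True

    def handle_data(self, data):
        if self._capturing:
            self.values.append(data.strip())

    def handle_endtag(self, tag):
        if tag == "td":
            self._capturing = False


def scrape_cells(page_html, target_class):
    """Return the text of all cells with the given class, in page order."""
    scraper = CellScraper(target_class)
    scraper.feed(page_html)
    return scraper.values


totals = scrape_cells(SAMPLE_PAGE, "total")
print(totals)
```

Note how the code depends on the markup rather than on any agreed contract. If the site renames the `total` class, the function quietly returns an empty list, which is the fragility described above.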

Direct Link/SSO

This approach is often preferred over scraping, although the two are closely related. Scraping pulls data from another application into our own, while a direct link or single sign-on (SSO) configuration brings the other application itself into our own. The user often sees this as a window within the primary application, or as a new window or tab that pops open. It can be a very fluid user experience that is easy to implement. While the integrated application binds the user to its own experience, that experience will also be familiar to anyone who has used the application separately from your solution.

Application Programming Interface

The API solution is often the smoothest way to integrate data between systems. It can be complex, but it can also be comprehensive. There are many ways to achieve this, but they all come down to two applications “talking” to each other. While it requires the interface to be defined and stable, it is one of the best ways to create a solution that is written once and requires little maintenance.
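
As a rough sketch of two applications "talking", the example below runs a tiny provider and a consumer in one Python script using only the standard library. The `/customers/<id>` route, the `CUSTOMERS` data, and the function names are assumptions made for illustration; a real API would add authentication, versioning, and error handling.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical data owned by the "provider" application.
CUSTOMERS = {"42": {"id": "42", "name": "Acme Co"}}


class CustomerAPI(BaseHTTPRequestHandler):
    """Provider side: answers GET /customers/<id> with JSON."""

    def do_GET(self):
        customer = CUSTOMERS.get(self.path.rsplit("/", 1)[-1])
        status = 200 if customer else 404
        body = json.dumps(customer or {"error": "not found"}).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


def fetch_customer(host, port, customer_id):
    """Consumer side: request one record and return (status, parsed JSON)."""
    connection = HTTPConnection(host, port, timeout=5)
    try:
        connection.request("GET", f"/customers/{customer_id}")
        response = connection.getresponse()
        return response.status, json.loads(response.read())
    finally:
        connection.close()


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), CustomerAPI)
threading.Thread(target=server.serve_forever, daemon=True).start()

status, customer = fetch_customer("127.0.0.1", server.server_port, "42")
missing_status, _ = fetch_customer("127.0.0.1", server.server_port, "99")
print(status, customer["name"], missing_status)

server.shutdown()
server.server_close()
```

The consumer only knows the route and the shape of the JSON. As long as the provider keeps that interface stable, either side can change internally without breaking the other.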

Data Feeds

A data feed is similar to an API in many ways. The main difference is that an API can be interactive and can work at the level of a single data item. A data feed is a data stream, or the result of a similar query, that is processed and pulled into a system in bulk. That can be beneficial when real-time integration is not needed. It is often associated with “nightly processing” and can remove the burden of extra processing during peak times of the day.
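
A batch-style feed import might look like the sketch below, which reads an RSS document and keeps only the items it has not processed before. The feed content, the use of `description` to carry a price, and the `load_feed` helper are hypothetical; a nightly job would download the feed and persist the set of seen identifiers between runs.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; a scheduled job would download this instead.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Price updates</title>
    <item><guid>sku-1</guid><title>Widget</title><description>9.99</description></item>
    <item><guid>sku-2</guid><title>Gadget</title><description>19.50</description></item>
  </channel>
</rss>
"""


def load_feed(feed_xml, already_seen):
    """Return feed items whose guid is not in already_seen."""
    root = ET.fromstring(feed_xml)
    new_items = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid in already_seen:
            continue  # processed in an earlier run
        new_items.append(
            {
                "guid": guid,
                "title": item.findtext("title"),
                "price": float(item.findtext("description")),
            }
        )
    return new_items


# Pretend last night's run already handled sku-1.
new_items = load_feed(SAMPLE_FEED, already_seen={"sku-1"})
print(new_items)
```

Unlike the item-level API call, the whole feed is read in one pass, so the work can be scheduled for whenever the systems are quiet.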

Data Import/Export

A step down from a data feed is an import/export. This can be done on an ad hoc basis and is the equivalent of taking a snapshot of a block of data. A data feed provides a stream of data, while an import/export works with files: one system exports the data to a file, and the other imports it, often after a transfer over FTP or SFTP. There are many similarities, but the import/export process tends to be more static and more limited in its use.
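
The file-based round trip can be sketched with a CSV snapshot. The field names and sample records here are made up for illustration, and the transfer step (FTP, SFTP, or a shared folder) is left out; the file is simply written and read back from a temporary directory.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical column layout agreed on by both systems.
FIELDS = ["id", "name", "balance"]


def export_snapshot(records, path):
    """Source system: write the current records to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records)


def import_snapshot(path):
    """Target system: read the CSV file and restore the column types."""
    with open(path, newline="", encoding="utf-8") as handle:
        return [
            {
                "id": int(row["id"]),
                "name": row["name"],
                "balance": float(row["balance"]),
            }
            for row in csv.DictReader(handle)
        ]


source_records = [
    {"id": 1, "name": "Acme Co", "balance": 120.5},
    {"id": 2, "name": "Globex", "balance": 0.0},
]

with tempfile.TemporaryDirectory() as workdir:
    snapshot = Path(workdir) / "accounts.csv"
    export_snapshot(source_records, snapshot)
    imported = import_snapshot(snapshot)

print(imported == source_records)
```

The file only reflects the data at the moment of export, which is why this approach suits occasional or ad hoc transfers better than ongoing synchronization.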

Next Steps

Feel free to schedule a time to discuss your next integration project with us. Our experience has taught us a lot about the pitfalls and challenges of these project types. We also have an e-book that can help you explore all the steps in building software, including a few templates. We ask that you share an e-mail address so we can send you a copy. We will add you to our monthly newsletter, but you can unsubscribe anytime, and your data is not shared with anyone else. Learn more about our book here.