C# Web Scraping Library Details


Shareware 1001 KB

The C# web-scraper framework is a sophisticated set of C# screen-scraping classes (also available on NuGet) which make extracting data from web applications and turning it into .NET objects, JSON, CSVs and spreadsheets enjoyable. Adding the Iron WebScraper DLL to your project brings in behind-the-scenes management of threading, document parsing, proxies, headers and cookies, so that you can focus on linear scraping logic which is easy to code and debug.

Publisher Description

Windows 10 Compatible

The web scraper for C# allows .NET developers to create logic that extracts content from web applications and turns it into JSON, spreadsheets, C# objects or even SQL using simple C# and LINQ code. Iron WebScraper is a web scraping library for the .NET 4.5 and Core platforms which allows developers to use clean, simple logic to turn any web resource back into C# objects or SQL. It can extract pages using step-by-step (if-this-then-that) workflows, effortlessly scraping and parsing HTML, JavaScript, XML, RSS, PDFs and office documents on the internet or local intranets back into useful structured data. This leaves the developer with clean, efficient web-scraping applications which are easy to understand and debug.

The C# Web Scraping Library is extremely polite, ensuring that no domain or IP address receives too many concurrent requests. It intelligently throttles on both the client and server side, looking for excessive CPU usage and slowing to an appropriate pace. In addition, it can obey robots.txt directives, including bot-specific crawl rates and limitations. The exact URLs and content types to be scraped can be set using logical workflows and regex/wildcard rules.

Screen scraping is made easier with identity control, which automatically manages threads, rate limits, URLs, duplicates, retries, proxies, headers and cookies as an army of virtual browsers that can mimic human behavior and even click buttons, fill in forms or log in behind security walls. This is useful for migrating legacy systems, populating enterprise search facilities and for statistical competitive analysis.

Full documentation, support and downloadable DLLs for the C# Web Scraper are available from http://ironsoftware.com/csharp/webscraper/ , in addition to links to a .NET 4.5+ NuGet package with full Azure and Mono compatibility.

Download and use it now: C# Web Scraping Library

Related Programs

Data Mining

The NeoNeuro Data Mining application is new-generation software which applies machine learning to ordinary and multidimensional clustering tasks. A simple data mining example is arithmetic learning. We give an example for the PLUS sign: 1 1 2 3 0 3 2...

The C# OCR Library

The C# OCR Library by the 'Iron OCR software Development Team' is a software package for C# programmers, adding optical character recognition to desktop and web location. Iron OCR can be used to scan documents or text as image assets into plain...

The C# PDF Library

IronPdf - The C# PDF library. IronPdf is our Microsoft .NET library making it easy for C# and Visual Basic developers to generate PDFs from within desktop applications, server applications and .NET / ASPX websites. IronPdf solves this issue by allowing...

CAD .NET: DWG DXF CGM PLT library for C#

CAD .NET is a library that allows specialists to develop software in the .NET environment (C#, VB.NET, J#) for working with CAD files. Its basic features include creating, importing and exporting CAD formats (AutoCAD DWG, DXF, HPGL, PLT, etc), raster...

JSON library

Delphi and C++ Builder JavaScript Object Notation (JSON) library. Features: read and modify existing JSON files; create new JSON files; full JSON support (literals, numbers, strings, arrays and objects); date/time encoding and decoding; customizable output. Available for Delphi/C++ Builder...