Parsing downloaded HTML content

Hello, integrators!

I would like to create a watchdog for monitoring prices on predefined e-shops. I’m using a REST API connector to download the content of the HTML pages, and it’s working perfectly.

However, I’m encountering difficulties when it comes to extracting the name and price using JS Mapper. Could you provide me with some hints or guidance on this matter?

For example, I want to download the name and the price from this URL:

The price is located in the following DIV, marked with the m-price__price class.

image

How can I effectively get it?

Thank you!

Hi Marian,

Do you mean something like the node-html-parser package on npm?

Tomas.

Exactly. Unfortunately, I think I cannot use this library in a JS mapper.

It is available only in the Node.js processor, which is part of the Integray Premium license. In the current version, third-party modules can be added using a special workaround; in a future release, it will be possible to add modules for the Node.js and Python services directly from the application.

Hi Marian,

With a standard license or a trial environment, you can use regular expressions to find the relevant content in the HTML page. It’s far from ideal, but it is a possible solution.
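
To illustrate the regex approach, here is a minimal sketch in plain JavaScript. It is not Integray-specific: extractByClass and parsePrice are hypothetical helper names, the sample HTML and the m-product__title class are made up for the example (only m-price__price comes from this thread), and the price parser assumes a comma as the decimal separator. It also assumes the target element does not contain a nested element of the same tag name.

```javascript
// Hypothetical helper: return the text of the first element whose class
// attribute contains `className` as a whole class token, or null if not found.
function extractByClass(html, className) {
  // Escape regex metacharacters so the class name is matched literally.
  const escaped = className.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    '<(\\w+)[^>]*\\sclass="(?:[^"]*\\s)?' + escaped +
      '(?:\\s[^"]*)?"[^>]*>([\\s\\S]*?)</\\1>',
    "i"
  );
  const match = pattern.exec(html);
  if (!match) return null;
  // Drop any inner tags and collapse whitespace.
  return match[2].replace(/<[^>]+>/g, "").replace(/\s+/g, " ").trim();
}

// Hypothetical helper: turn price text into a number.
// Assumption: spaces are thousands separators and a comma is the decimal mark.
function parsePrice(text) {
  if (text === null) return NaN;
  return Number(text.replace(/[^\d,]/g, "").replace(",", "."));
}

// Made-up sample page for demonstration only.
const sampleHtml = `
  <h1 class="m-product__title">Example Product</h1>
  <div class="m-price__price m-price__price--big">
    <span>1 299,50</span> CZK
  </div>
`;

const name = extractByClass(sampleHtml, "m-product__title");
const price = parsePrice(extractByClass(sampleHtml, "m-price__price"));
```

In the mapper, the downloaded page body would take the place of sampleHtml. Returning null for a missing element keeps "not found" distinguishable from an empty string.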

Either way, scraping HTML pages will likely lead to errors caused by unexpected results, because you do not control third-party web pages and they may change at any time.
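
One way to soften that risk is to check every scraped record before it is used, so a changed page layout shows up as an explicit problem instead of a silently wrong price. The sketch below is only an illustration: validateScrapedItem and the { name, price } record shape are assumptions, not part of Integray.

```javascript
// Hypothetical check: return a list of problems found in a scraped record.
// An empty list means the record looks usable.
function validateScrapedItem(item) {
  const problems = [];
  if (typeof item.name !== "string" || item.name.trim() === "") {
    problems.push("name is missing");
  }
  if (
    typeof item.price !== "number" ||
    !Number.isFinite(item.price) ||
    item.price <= 0
  ) {
    problems.push("price is missing or not a positive number");
  }
  return problems;
}

// A record as it might look after a successful scrape.
const goodItem = { name: "Example Product", price: 1299.5 };
// A record as it might look after the shop changed its markup.
const brokenItem = { name: "", price: NaN };

const goodProblems = validateScrapedItem(goodItem);
const brokenProblems = validateScrapedItem(brokenItem);
```

A watchdog could skip or flag any record with a non-empty problem list rather than comparing prices with it.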

Thank you, Libor, I’ll use the Regex!
