Tuesday, March 12, 2024

Extracting Meta Tag Data Using JavaScript



Introduction

When building, analyzing, or scraping web pages, it's often necessary to extract meta tag data. These tags provide information about the HTML document, such as descriptions, keywords, author information, and more.

In this Byte, we’ll explain how to extract this data using JavaScript.

Retrieving Meta Tag Information

To retrieve meta tag data, we can use the querySelector() method in JavaScript. This method returns the first Element within the document that matches the specified selector, or group of selectors.

Here's an example:

let metaDescription = document.querySelector('meta[name="description"]')
                      .getAttribute("content");
console.log(metaDescription);

In this code, we’re querying for a meta tag with the name ‘description’ and then getting the ‘content’ attribute of that tag. The console will output the description of the page.
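One caveat: querySelector() returns null when no element matches, so chaining .getAttribute() directly throws a TypeError on pages that have no description tag. Below is a minimal sketch of a null-safe variant. The helper name getMetaContent and its root parameter are illustrative assumptions, not part of any standard API, and the sketch assumes the tag name contains no double quotes.

```javascript
// Hypothetical helper: returns the content of <meta name="..."> or null if absent.
// `root` is any object exposing querySelector(), e.g. `document` in a browser.
// Assumes `name` contains no double quotes (it is placed inside the selector).
function getMetaContent(root, name) {
    const tag = root.querySelector(`meta[name="${name}"]`);
    // querySelector() returns null when nothing matches, so guard before reading.
    return tag ? tag.getAttribute("content") : null;
}

// In a browser you might call it like this:
// console.log(getMetaContent(document, "description"));
```

Passing the root in as a parameter keeps the helper usable with any element or document, not just the global one.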

Working with Open Graph (OG) Meta Tags

Open Graph meta tags are used to enrich the “preview” of a webpage on social media or in a messenger. They allow you to specify the title, description, and image that will be used when your page is shared.

To fetch the Open Graph title of a page, you can use the following code:

let ogTitle = document.querySelector("meta[property='og:title']")
              .getAttribute("content");
console.log(ogTitle);

This code fetches the Open Graph title of the page and prints it to the console.
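If you need more than the title, the same idea extends to every Open Graph tag at once. The sketch below uses querySelectorAll() with a "starts with" attribute selector. The function name getOpenGraphData and the choice to strip the "og:" prefix from the keys are assumptions made for illustration.

```javascript
// Hypothetical helper: gathers every Open Graph tag into a plain object,
// e.g. { title: "...", image: "..." }. `root` is `document` in a browser.
function getOpenGraphData(root) {
    const data = {};
    // The ^= attribute selector matches property values that start with "og:".
    const tags = root.querySelectorAll('meta[property^="og:"]');
    for (const tag of tags) {
        const key = tag.getAttribute("property").slice(3); // drop the "og:" prefix
        data[key] = tag.getAttribute("content");
    }
    return data;
}

// In a browser you might call it like this:
// console.log(getOpenGraphData(document));
```

If a page repeats a property (several og:image tags, for example), this sketch keeps only the last value.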

Fetching Data from All Document Meta Tags

If you want to fetch data from all the meta tags in a document, you can use the getElementsByTagName() method, which returns a live HTMLCollection of elements with the given tag name.

Here's how you can do it:

let metaTags = document.getElementsByTagName("meta");

for (let i = 0; i < metaTags.length; i++) {
    console.log(metaTags[i].getAttribute("name") + " : " + metaTags[i].getAttribute("content"));
}

This code will output the “name” and “content” attributes of all the meta tags in the document.
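Note that getAttribute() returns null for a missing attribute, so tags without a "name" (Open Graph tags use "property" instead, and the charset tag has neither "name" nor "content") show up as null in that output. As a rough sketch, the loop can be adapted to build a lookup object and skip such tags. The function name metaTagsToObject is hypothetical.

```javascript
// Hypothetical helper: maps each meta tag's name (or property) to its content.
// `metaTags` is any iterable of elements, e.g. document.getElementsByTagName("meta").
function metaTagsToObject(metaTags) {
    const result = {};
    for (const tag of metaTags) {
        // Standard tags use "name"; Open Graph tags use "property".
        const key = tag.getAttribute("name") || tag.getAttribute("property");
        const content = tag.getAttribute("content");
        // Skip tags such as <meta charset="utf-8"> that lack a key or content.
        if (key && content !== null) {
            result[key] = content;
        }
    }
    return result;
}

// In a browser you might call it like this:
// console.log(metaTagsToObject(document.getElementsByTagName("meta")));
```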

Retrieving Meta Tags Using Node.js

Up until this point we've seen how to extract the meta tag data using JS in-browser. We know this because all examples have used the document object, which is only available in browser environments. Let’s now see how you can do this from a different JS runtime, like Node.

Assuming you have Node and npm on your machine, install the axios and cheerio libraries:

$ npm install axios cheerio

Link: To learn more about how to use the Axios library, read our article, Making Asynchronous HTTP Requests in JavaScript with Axios.

To learn more about Cheerio.js, see our guide, Build a Web-Scraped API with Express and Cheerio.

Load the libraries into your script using require():

const axios = require('axios');
const cheerio = require('cheerio');

Now we’ll use Axios to fetch the web page we’re interested in. axios.get() returns a promise, so be sure to handle it properly with async/await or a .then() block.

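// await must run inside an async function when the script is loaded with require()
async function fetchPage() {
    try {
        const response = await axios.get('https://example.com');

        // Extract the page data here...
    } catch (error) {
        console.error(error);
    }
}

fetchPage();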

Now we can use Cheerio.js to extract the meta tags from the HTML we've fetched. If you’ve ever worked with jQuery, you'll notice how similar Cheerio.js is.

async function printMetaTags() {
    try {
        const response = await axios.get('https://example.com');
        // Parse the fetched HTML so it can be queried with jQuery-style selectors
        const $ = cheerio.load(response.data);
        const metaTags = $('meta');

        metaTags.each((i, tag) => {
            const name = $(tag).attr('name');
            const content = $(tag).attr('content');
            console.log(`Meta name: ${name}, content: ${content}`);
        });
    } catch (error) {
        console.error(error);
    }
}

printMetaTags();

What we've done here is load the HTML response into Cheerio and then grab all the meta tags. We looped through each and printed out the “name” and “content” attributes. You can easily modify this code to grab other attributes or structure the data as needed.
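As one possible way to structure the data, the sketch below collects the tags into a plain object instead of printing them. The function name collectMeta is hypothetical. It takes the $ returned by cheerio.load() as its argument and falls back to the "property" attribute so that Open Graph tags are included.

```javascript
// Hypothetical helper: turns the <meta> tags of a loaded Cheerio document
// into a plain object keyed by the tag's name (or property).
// `$` is the function returned by cheerio.load(html).
function collectMeta($) {
    const result = {};
    $('meta').each((i, tag) => {
        const key = $(tag).attr('name') || $(tag).attr('property');
        const content = $(tag).attr('content');
        // .attr() returns undefined for a missing attribute, so skip those tags.
        if (key && content !== undefined) {
            result[key] = content;
        }
    });
    return result;
}

// Inside the try block above you might call it like this:
// const meta = collectMeta(cheerio.load(response.data));
// console.log(meta.description);
```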

Conclusion

In this Byte, we've explored how to extract meta tag information from a webpage using JavaScript. We covered how to retrieve specific meta tag data, work with Open Graph tags, and fetch data from all meta tags in a document.

We also saw how to extract meta tag information from other JavaScript runtimes, like Node.js, using the Axios and Cheerio.js libraries.


