Improving an existing manga reader website by scraping and modifying the DOM with Node.js

Rifki (Kubid) Fauzi
2 min read · Apr 1, 2021

I stumbled across a manga reader website called readsnk.com. While reading manga on it, I ran into a minor but annoying flaw in the user experience: when I open the website, it loads all the images simultaneously instead of displaying them sequentially.

Since the images are not loaded in order, the 30th page can load first while the first page loads last. Because of this problem, I decided to make a clone of the website.

My approach is to scrape the HTML using an HTTP request, modify the HTML by adding native browser lazy loading to the images, and serve the modified HTML to the client (the browser). The stack I use is Node.js, Express, and Axios.
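The scrape, modify, serve flow described above can be sketched roughly as follows. This is a minimal illustration, not the code behind the actual clone: `createProxyHandler`, `fetchHtml`, and `transform` are hypothetical names, and the fetching and transforming steps are passed in as functions so the handler can be tried without any network access.

```javascript
// Build a request handler that scrapes, modifies, and serves a page.
//   origin:    site to mirror, e.g. "https://ww7.readsnk.com"
//   fetchHtml: async (url) => string  (hypothetical; e.g. a thin wrapper around axios.get)
//   transform: (html) => string       (hypothetical; the HTML modifications)
function createProxyHandler(origin, fetchHtml, transform) {
  return async function handler(req, res) {
    try {
      // 1. Scrape the original page, 2. modify its HTML.
      const body = transform(await fetchHtml(origin + req.url));
      // 3. Serve the modified HTML to the browser.
      res.writeHead(200, { "Content-Type": "text/html; charset=utf-8" });
      res.end(body);
    } catch (err) {
      // Fetching or transforming failed, so report a gateway error.
      res.writeHead(502, { "Content-Type": "text/plain" });
      res.end("Could not fetch the original page");
    }
  };
}

// Usage (not executed here; assumes `npm install axios`):
// const http = require("node:http");
// const axios = require("axios");
// const fetchHtml = async (url) => (await axios.get(url)).data;
// const addLazy = (html) => html.replace(/<img/g, '<img loading="lazy"');
// http.createServer(createProxyHandler("https://ww7.readsnk.com", fetchHtml, addLazy)).listen(3000);
```

Because the handler only relies on `req.url`, `res.writeHead`, and `res.end`, it should also be usable as Express middleware via `app.use(handler)`.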

TL;DR: compare the improved manga reader vs. the original

Scraping the page using Axios
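A minimal sketch of what the scraping step might look like. The helper name `scrapePage` is hypothetical, and the HTTP client is passed in as a parameter (an assumption made here so the function can be exercised with a stub); any object with an Axios-style `get(url)` that resolves to `{ data }` will do.

```javascript
// Fetch a page and return its HTML as a string.
// `client` is any object with an Axios-style get(url) resolving to { data }.
async function scrapePage(client, url) {
  const response = await client.get(url);
  // For an HTML response, Axios exposes the body as a string in `data`.
  return response.data;
}

// Usage with Axios (not executed here; assumes `npm install axios`):
// const axios = require("axios");
// scrapePage(axios, "https://ww7.readsnk.com/").then((html) => console.log(html.length));
```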

Modifying HTML DOM using regex

You could use a DOM parser like cheerio in Node.js, but that's overkill for a small project like this. As you can see in the image above, I made 4 changes to the DOM.

.replace(/<img/g, "<img loading=\"lazy\" width=\"1600\" height=\"1124\"")

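This appends loading="lazy" to every <img> tag. Note that lazy loading is far less effective unless you also set width and height: without explicit dimensions, images that have not loaded yet can take up almost no space, so many of them fall inside the viewport at once and load right away. With these attributes set, the browser only loads images that are in or near the viewport.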
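The same replacement can be wrapped in a small function and tried on a sample string. This is only a sketch: `addLazyLoading` is a hypothetical name, and the default dimensions simply mirror the 1600 and 1124 used in the call above.

```javascript
// Add native lazy loading plus explicit dimensions to every <img> tag.
function addLazyLoading(html, width = 1600, height = 1124) {
  // \b keeps the pattern from matching longer tag names that merely start with "img".
  return html.replace(
    /<img\b/gi,
    `<img loading="lazy" width="${width}" height="${height}"`
  );
}

const samplePage = '<div><img src="/page-1.jpg"><img src="/page-2.jpg"></div>';
console.log(addLazyLoading(samplePage));
```

One caveat of the regex approach: an image that already carries its own loading, width, or height attribute ends up with duplicates. That is where a DOM parser would start to pay off.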

.replace(/href="\/css/g, 'href="https://ww7.readsnk.com/css').replace(/src="\/js/g, 'src="https://ww7.readsnk.com/js')

This replaces the relative paths in the href and src attributes that point to CSS and JS files with absolute paths, so those assets are still loaded from the original site.
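As a sketch, the two replacements can be folded into one helper. The name `absolutizeAssets` and the `dirs` parameter are hypothetical, and the directory names are assumed to be plain words that are safe to place inside a regular expression.

```javascript
// Rewrite root-relative href/src attributes for the given asset directories
// so that they point at the original site.
function absolutizeAssets(html, origin, dirs = ["css", "js"]) {
  // Builds e.g. /(\s)(href|src)="\/(css|js)\//g
  const pattern = new RegExp(`(\\s)(href|src)="/(${dirs.join("|")})/`, "g");
  // $1 = leading whitespace, $2 = attribute name, $3 = matched directory.
  return html.replace(pattern, `$1$2="${origin}/$3/`);
}

const sampleHead =
  '<link rel="stylesheet" href="/css/app.css"><script src="/js/app.js"></script>';
console.log(absolutizeAssets(sampleHead, "https://ww7.readsnk.com"));
```

Restricting the rewrite to the asset directories leaves other root-relative links, such as links to other chapters, pointing at the clone rather than at the original site.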

You can check the demo and source code here: source, demo

I also added Redis caching to the app; I'll write about it in the next post.
