Improving an existing manga reader website by scraping and modifying the DOM with Node.js
I stumbled across a manga reader website called readsnk.com. While reading manga on it, I ran into a slightly buggy user experience: for example, when I open the website, it loads all the images simultaneously instead of displaying them sequentially.
Since the images are not loaded sequentially, the 30th page can load first and the first page last. Because of this problem, I decided to make a clone of the website.
My approach is to scrape the HTML with an HTTP request, modify the HTML by adding native browser lazy loading to the images, and serve the modified HTML to the client (browser). The stack I use is Node.js, Express, and Axios.
Scraping the page using Axios
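As a rough sketch of this step (not the exact code from the app), scraping boils down to a single GET request that returns the page as a string. The helper name scrapePage and the injected httpClient parameter are my own assumptions for illustration; passing the client in makes it easy to swap Axios for a fake during testing.

```javascript
// Hypothetical helper: downloads a page and returns its HTML as a string.
// `httpClient` is any object with an Axios-style get(url, config) method,
// for example the object returned by require("axios").
async function scrapePage(httpClient, url) {
  // responseType "text" asks Axios to hand back the body as a plain string
  const response = await httpClient.get(url, { responseType: "text" });
  return response.data; // Axios puts the response body in `data`
}

// Usage with Axios (assumes `npm install axios`):
//   const axios = require("axios");
//   scrapePage(axios, "https://ww7.readsnk.com/")
//     .then((html) => console.log(html.length));
```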
Modifying HTML DOM using regex
You could use a DOM parser like cheerio in Node.js, but that is overkill for a small project like this. As you can see in the image above, I made 4 changes to the DOM.
.replace(/<img/g, "<img loading=\"lazy\" width=\"1600\" height=\"1124\"")
This adds loading="lazy" to every <img> tag. Note that loading="lazy" will not work properly unless you also set the width and height. With these attributes in place, the browser only loads the images that are inside your browser viewport.
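Here is a minimal, self-contained sketch of that replacement that you can run on a sample string. The function name addLazyLoading and the sample HTML are made up for illustration; the 1600 and 1124 defaults come from the snippet above. I also use /<img\b/ instead of /<img/ so that only whole img tag names match, which is a small variation on the original regex.

```javascript
// Hypothetical helper: adds lazy-loading attributes to every <img> tag.
// The default width and height are taken from the regex shown in the post.
function addLazyLoading(html, width = 1600, height = 1124) {
  // \b makes sure we only match the tag name "img" and not a longer name
  return html.replace(
    /<img\b/g,
    `<img loading="lazy" width="${width}" height="${height}"`
  );
}

// Made-up sample markup, just to show the effect
const sampleImages = '<p><img src="/page-1.jpg"><img src="/page-2.jpg"></p>';
console.log(addLazyLoading(sampleImages));
```

Keep in mind that a regex like this blindly prepends the attributes, so an image that already has its own width or height would end up with duplicates.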
.replace(/href\=\"\/css/g, 'href="https://ww7.readsnk.com/css').replace(/src\=\"\/js/g, 'src="https://ww7.readsnk.com/js')
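This replaces all tag attributes that contain a relative path (/css and /js) with absolute URLs pointing to https://ww7.readsnk.com, so the stylesheets and scripts still load when the page is served from my own server.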
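The same idea as a small runnable sketch, assuming the stylesheets live under /css and the scripts under /js as in the snippet above. The function name absolutizeAssets, the ORIGIN constant, and the sample markup are hypothetical. Note that link tags use href while script tags use src, so each replacement has to keep its own attribute name.

```javascript
// Origin of the scraped site, taken from the URLs used in the post
const ORIGIN = "https://ww7.readsnk.com";

// Hypothetical helper: rewrites root-relative /css and /js paths so they
// point back at the original site instead of my own server.
function absolutizeAssets(html, origin = ORIGIN) {
  return html
    .replace(/href="\/css/g, `href="${origin}/css`) // <link href="/css/...">
    .replace(/src="\/js/g, `src="${origin}/js`); // <script src="/js/...">
}

// Made-up sample markup, just to show the effect
const sampleHead =
  '<link rel="stylesheet" href="/css/app.css"><script src="/js/app.js"></script>';
console.log(absolutizeAssets(sampleHead));
```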
I also added Redis caching to the app; I’ll write about it in the next post.