How to do web scraping in Java – Part II

In this second part, we will build the program. For the previous steps, see Part I.

Step IV : ( Using Jsoup )

  • Add the following code to the main method of your class.


  • final Document document;
     Scanner input = new Scanner(System.in);
     System.out.print("Enter the search : >>>> ");
     String name = input.nextLine();

    Note : Make sure Document is imported from the jsoup dependency (org.jsoup.nodes.Document).

Important :

  • Next, look at how your browser builds the address for a search. Run any query and note the pattern of the generated link. Then add the next line, which appends the search text to that link,
  •  document = Jsoup.connect("" + name).get();
  • In the video, we noticed that the results appear as a list of <li> elements. To visit all of them with a for loop, we first need to find out in the Inspector which class they sit under.
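
The link pattern mentioned above can be sketched roughly like this. It is only an illustration: BASE_URL and the class name SearchUrlBuilder are hypothetical placeholders (not the site used in this tutorial), and the sketch assumes Java 10 or later. The point it shows is that the search text should be URL-encoded before it is appended, so that spaces and symbols do not break the link.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class SearchUrlBuilder {
    // Hypothetical prefix for illustration only; use the pattern you saw in your own browser.
    static final String BASE_URL = "https://search.example.com/search?p=";

    // Appends the URL-encoded search text to the fixed prefix.
    static String buildSearchUrl(String query) {
        return BASE_URL + URLEncoder.encode(query.trim(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Spaces become '+', and characters such as '&' are percent-encoded.
        System.out.println(buildSearchUrl("web scraping in java"));
    }
}
```

The string returned by buildSearchUrl(name) could then be passed to Jsoup.connect(...) in place of a plain concatenation.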


  • Using that class, write a for loop that selects each element in it. So in the next line,
  • for (Element row : document.select("ol.mb-15.reg.searchCenterMiddle li")) {
  • Now decide what you want to extract from each <li> element. Say I want only the headings and links of the search results.
  • So I look for the tags that hold the heading and its link.

For Heading(s):


For link(s) :


In the Inspector you can see the class names under which the heading and its link are found, so we just need to select those tags inside each <li> element. So inside the for loop:

  • String title = row.select(".title").text();
     String url = row.select("span.fz-ms.fw-m.fc-12th.wr-bw.lh-17").text();
     System.out.println(title + "-> URL: " + url + "\n\n");
     title = title.replace(",", "|");
     }
  • The replace call is there because, when you export the search results to csv/excel format, a title containing “,” is split into separate cells unless the “,” is replaced with “|”.

That’s it. You can now build your program and enter any search query as input to get the scraped data. From there, you can export it to txt, csv or whatever format you like.
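
As a rough sketch of the csv export, the snippet below builds one line per result and writes the lines to a file. The class name CsvExportSketch, the file name results.csv and the sample rows are all made up for illustration. It reuses the same trick as above, replacing “,” with “|” so a comma inside a title does not start a new column.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class CsvExportSketch {
    // Replaces commas so a field cannot be split into several csv columns.
    static String sanitize(String field) {
        return field.replace(",", "|");
    }

    // Builds one csv line from a scraped title and url.
    static String toCsvLine(String title, String url) {
        return sanitize(title) + "," + sanitize(url);
    }

    public static void main(String[] args) throws IOException {
        List<String> lines = new ArrayList<>();
        lines.add("title,url"); // header row
        // Made-up sample rows standing in for the scraped results.
        lines.add(toCsvLine("Java, Jsoup and scraping", "www.example.com/jsoup"));
        lines.add(toCsvLine("Plain title", "www.example.com/plain"));
        // results.csv is a hypothetical output file in the working directory.
        Files.write(Paths.get("results.csv"), lines, StandardCharsets.UTF_8);
    }
}
```

In the real program you would call toCsvLine(title, url) inside the for loop instead of using the sample rows.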

You can get the entire project here.


