How to scrape data on Leboncoin with no code for free?
Leboncoin is a fantastic source of data. According to Antoine Jouteau, the site's CEO, speaking to Le Parisien on May 26, 2020, the French classified-ads platform has 30 million ads online at any given time, with 3 priority sectors: automotive, real estate, and employment.
Beyond the colossal volume, each ad on leboncoin is an exceptional mine of information. On a real estate ad, you can find the title of the ad, but also the pictures, the price of the property, the location, the phone number of the advertiser, the number of rooms...
A little Python script should do the trick... A few lines of code, et voilà!
import requests

# Reuse a single session so cookies persist between requests
s = requests.Session()
r = s.get('https://www.leboncoin.fr/ventes_immobilieres/2156084495.htm')
# Save the raw HTML of the ad page to disk
with open('2156084495.htm', 'w', encoding='utf-8') as f:
    f.write(r.text)
At runtime, however, it's a cold shower: the site is protected by DataDome, a French company specialized in bot mitigation, which detects visitors it considers to be robots... and denies them access to the site, however well-intentioned they may be.
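To notice the refusal instead of silently saving an error page, one option is to look at the HTTP status code before writing the file. The sketch below is only an illustration and not a way around the protection. It assumes that a refused request is answered with a 403 status, which is not confirmed here, and `looks_blocked` is a hypothetical helper name.

```python
# Assumption: a refused request is answered with HTTP 403 (Forbidden).
BLOCKED_STATUSES = {403}


def looks_blocked(status_code: int) -> bool:
    """Return True when the status code suggests the request was refused."""
    return status_code in BLOCKED_STATUSES


print(looks_blocked(403))
print(looks_blocked(200))
```

With the script above, calling `looks_blocked(r.status_code)` right after `s.get(...)` would tell you whether the saved file is likely to contain the ad or a refusal page.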
So, how can you scrape leboncoin and get all the new ads of a given category every day, without any code and without any trouble?
In this article, we'll see how to do that in about 30 seconds. No more and no less.
🤖
First of all, we're going to go to leboncoin and take the search URL - this is the initial URL from which the robot will retrieve the ads.
Just go to leboncoin, use the set of search criteria of your choice, and retrieve our precious URL.
For example, all the houses for rent in Cassis, in the south of France, this summer:
You have to know how to treat yourself...
🌞
And here is our search URL, to keep preciously for the future: https://www.leboncoin.fr/recherche?category=9&locations=Cassis_13260__43.21419_5.54296_4846
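If you are curious about what this URL carries, here is a small sketch that splits it into its query parameters with Python's standard library. It only reads the URL and sends no request. The variable names are illustrative, and what each parameter means on leboncoin's side is not documented here.

```python
from urllib.parse import parse_qs, urlsplit

SEARCH_URL = (
    "https://www.leboncoin.fr/recherche"
    "?category=9&locations=Cassis_13260__43.21419_5.54296_4846"
)

# parse_qs maps each parameter name to the list of values found in the URL
params = parse_qs(urlsplit(SEARCH_URL).query)

for name, values in params.items():
    print(name, values)
```

Changing the search criteria on the site changes these parameters, which is why the URL alone is enough to describe the search.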
Now let's set up the online collection tool! 0 lines of code. 30 seconds flat. As easy as that.
First let's quickly go to the nice scraping tool, directly controllable from an interface, right here:
https://lobstr.io/store/33db1ca85160105eeb84d5aa51cfad10/leboncoin-iter-listings
And click directly on "Start Now":
By clicking the little download icon to the right of "Output", you can download a sample of a hundred lines and already get a feel for the data format.
Then, let's create a new cluster:
The UX has been really well thought out.
Then we type "leboncoin" to quickly find the crawler we want to use:
Be careful: you have to choose 'Leboncoin Listings Search Export', not its alter ego with phone numbers! The other crawler lets you obtain the precious phone numbers present in the ads, but you absolutely have to provide a leboncoin account, and it's only usable if you've signed up for a paid plan.
Let's launch!
First, place the previously saved URL in the URL field (1).
Also, we don't want to collect ads that are more than 24 hours old. An old ad is an ad that is already losing its value. In the 'Hours Back' field, we will therefore enter 24 (2).
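To make the idea of an 'Hours Back' window concrete, here is a minimal sketch of a freshness check. It is not how the tool implements the setting. The function name `is_recent` and the timestamps are made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_recent(published_at: datetime, now: datetime, hours_back: int = 24) -> bool:
    """Return True if the ad was published within the last `hours_back` hours."""
    return now - published_at <= timedelta(hours=hours_back)


# Made-up timestamps, for illustration only.
now = datetime(2022, 6, 1, 8, 0, tzinfo=timezone.utc)
fresh = datetime(2022, 5, 31, 20, 0, tzinfo=timezone.utc)  # 12 hours old
stale = datetime(2022, 5, 29, 8, 0, tzinfo=timezone.utc)   # 72 hours old

print(is_recent(fresh, now))
print(is_recent(stale, now))
```

With a window of 24 hours, a daily run picks up each new ad once without dragging along older ones.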
Finally, click the nice Save button (3):
On the next screen, click on 'Manually' (4) - we will launch the crawler only once, so a click by hand will be enough.
And simply press Save (5):
If you want to launch the crawler for example every morning at 8am, you can click on 'Repeatedly'. No need to get up early in the morning and launch your run by hand, we take care of everything!
And that's it, your cluster is created!
To launch it, just click on 'Launch' in the upper right corner:
And here it is, the machine is launched!
5 steps, 30 seconds of deployment. No more, no less.
Click quickly on the 'Run' that has just been launched:
You are now on the 'Run' page, where the results will be displayed in real time. And after a few minutes of waiting... what a pleasure!
Superb data, cleanly structured and directly usable (1). And you can download it by pressing the big red button (2) at the top right, as shown below:
And once opened in Numbers, here's the result - great real estate data, cleanly structured, directly actionable, and collected for free. In 5 clicks. Simply magical.
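If you prefer Python to a spreadsheet, the downloaded file can also be read with the standard library. The sketch below assumes the export is a CSV file with a header row. The sample rows and the column names (`title`, `price`, `location`, `rooms`) are made up for illustration and may not match the real export.

```python
import csv
import io

# Made-up sample standing in for the downloaded file; column names are hypothetical.
SAMPLE_EXPORT = """title,price,location,rooms
House with sea view,1200,Cassis,4
Studio near the port,650,Cassis,1
"""


def load_listings(file_obj):
    """Read a CSV export into a list of dicts, one per listing."""
    return list(csv.DictReader(file_obj))


listings = load_listings(io.StringIO(SAMPLE_EXPORT))

# Example: find the cheapest listing in the sample
cheapest = min(listings, key=lambda row: int(row["price"]))
print(cheapest["title"])
```

For a real file, you would pass an opened file instead of the in-memory sample, for instance `open("export.csv", newline="", encoding="utf-8")`, where the filename is again hypothetical.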
In the end, in 1 minute and without code, we collected 35 real estate listings without phone numbers, i.e. approximately 35 listings per minute. With our free plan of 15 minutes per day, you can collect approx. 500 ads per day without phone numbers. Without spending a penny.
The free plan is (fortunately?) limited to 15 minutes per day, and it does not allow collecting phone numbers. If you need phone numbers and/or more collection time, you can go directly here.
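The daily estimate is simple arithmetic, sketched below. It assumes the rate observed during this single run stays constant over the whole 15 minutes, which real runs may not match.

```python
ADS_PER_MINUTE = 35        # rate observed during the run above
FREE_MINUTES_PER_DAY = 15  # daily limit of the free plan

ads_per_day = ADS_PER_MINUTE * FREE_MINUTES_PER_DAY
print(ads_per_day)  # 525
```

That gives 525 ads, which is where the rounded figure of about 500 ads per day comes from.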
Leboncoin is a great source of data. With more than 135 million monthly visits, it is the leading classified ad site in France, particularly in the real estate, automobile and employment sectors.
However, the 'bot-mitigation' tools make any automatic collection impossible, and force users to manually collect data.
With lobstr, collect data in 5 clicks, on the target of your choice. No code required. Completely free.
Happy scraping!
🦞
Co-founder @ lobstr.io since 2019. Genuine data enthusiast and lowercase aesthetic observer. Ensuring you get the hot data you need.