Command-line interface crawler for Subito.it
- Scan all the search result pages
- Smartphone notifications with Pushover
- Detect changes on old listings
- Sold items can be excluded
- Price range filtering
- Filter items by regex matching on titles
# Clone this repo somewhere
git clone https://github.com/Kianda/subitoo.git subitoo && cd subitoo
chmod +x subitoo.sh
# Then use the file ./subitoo.sh to execute Subitoo (or set any alias you want)

# Optional: set a 'subitoo' alias
sed -i '/alias subitoo=/d' ~/.bash_aliases; echo "alias subitoo='$(pwd)/subitoo.sh'" >> ~/.bash_aliases && source ~/.bashrc

# Optional: create .env file to set a custom image tag (otherwise it falls back to TAG "1")
cp .env.example .env

# To update: cd /absolute/path/to/subitoo/ and pull the image manually
# (or set a cron for it -> check 'Cron' section)
docker compose pull

To enable notifications you need the APPLICATION_TOKEN and USER_KEY from your Pushover account.
You can copy your USER_KEY from the Pushover homepage after you have logged in.
You can copy your APPLICATION_TOKEN after you have created a Pushover app; give it a name and, if you want, a 72x72 image.
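Copy-paste mistakes in these two values are easy to make. The following is a minimal sketch of a sanity check, assuming Pushover's documented key format (30 characters from the set A-Z, a-z, 0-9). The function name `is_pushover_id` is hypothetical and not part of Subitoo.

```shell
# Hypothetical helper, not part of Subitoo.
# Succeeds when the argument looks like a Pushover user key or application
# token: exactly 30 characters, letters and digits only (assumed format).
is_pushover_id() {
  case $1 in
    *[!A-Za-z0-9]*) return 1 ;;   # contains a character outside the allowed set
  esac
  [ "${#1}" -eq 30 ]
}

# Example check on a sample value
if is_pushover_id "abcd11e25fg8h5i1yg14abc2c8u28o"; then
  echo "format looks valid"
else
  echo "format looks wrong"
fi
```

This only checks the shape of the value; it cannot tell whether the key actually belongs to your account.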
Then save the keys inside Subitoo in APPLICATION_TOKEN:USER_KEY format, like this:
# example
subitoo config --setPushoverKeys abcd11e25fg8h5i1yg14abc2c8u28o:abc52de1tx9z315ppq5zzb43a1v6hc

Execute a test:
subitoo maintenance --testNotification

Go to Subito.it, perform a search (apply all the filters you want) and copy the URL.
# example
https://www.subito.it/annunci-lombardia/vendita/usato/?q=nvidia+gtx+1060&qso=true

Save it on Subitoo:
# example
subitoo add --name "GTX 1060" --url "https://www.subito.it/annunci-lombardia/vendita/usato/?q=nvidia+gtx+1060&qso=true"

You can check all your saved URLs with:
subitoo ls

Run Subitoo; it will notify you if new items appear in that search:
# This is a one-time run.
# You need to execute this every time (check the 'Cron' section)
subitoo run

To run Subitoo automatically, use your operating system's job scheduler, such as cron:
crontab -e

# This will run Subitoo every 2 hours
0 */2 * * * cd /your/absolute/path/to/subitoo/ && ./subitoo.sh run
# And update once a day
0 0 * * * cd /your/absolute/path/to/subitoo/ && docker compose pull
To learn more, use the built-in help:
subitoo --help
subitoo run --help
subitoo add --help
subitoo list --help
subitoo delete --help
subitoo enable --help
subitoo disable --help
subitoo maintenance --help
subitoo configuration --help

A more complex subitoo add example. This will search for:
- iPhone keyword (url parameter)
- All Italy as location (url parameter)
- Only in the listing title (url parameter)
- Only with shipping available (url parameter)
- Minimum price is 200
- Maximum price is 450
- Will ignore already sold items
- Will ignore if the price is missing
- Scan only the first 2 pages of the results
- Apply regex (?i)^(?=.*plus)(?!.*iphone 12) on listing title
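The regex in the last item combines two lookaheads: `(?=.*plus)` requires "plus" somewhere in the title, `(?!.*iphone 12)` rejects titles containing "iphone 12", and `(?i)` makes both case-insensitive. As a rough illustration of that rule only, here is the same logic written as plain shell pattern matching. The function name `title_passes` is hypothetical, and this is not how Subitoo evaluates the regex internally.

```shell
# Hypothetical helper, not part of Subitoo: succeeds when a title satisfies
# "contains 'plus' and does not contain 'iphone 12'", ignoring case.
title_passes() {
  # Lowercase the title to mimic the (?i) case-insensitive flag.
  lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case $lower in
    *"iphone 12"*) return 1 ;;  # corresponds to (?!.*iphone 12)
    *plus*)        return 0 ;;  # corresponds to (?=.*plus)
    *)             return 1 ;;  # no "plus" in the title
  esac
}

# Try the rule on a few sample titles
for title in "Apple iPhone 8 Plus 64GB" "iPhone 12 Pro Max plus cover" "iPhone 11 Pro"; do
  if title_passes "$title"; then
    echo "keep: $title"
  else
    echo "skip: $title"
  fi
done
```

Only the first sample title is kept: the second contains "iphone 12" and the third has no "plus". The complete command for this search: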
subitoo add --name MyiPhone --url "https://www.subito.it/annunci-italia/vendita/usato/?q=iPhone&qso=true&shp=true" --pages 2 --minPrice 200 --maxPrice 450 --skipNoPrice --skipSold --regex '(?i)^(?=.*plus)(?!.*iphone 12)'

If you want, you can build your own image:
# cd /your/absolute/path/to/subitoo/
export TAG_VERSION='1.1' && \
export TAG_MAJOR='1' && \
export HUB_PATH='kianda/subitoo' && \
docker build -f Dockerfile --no-cache -t $HUB_PATH:$TAG_VERSION -t $HUB_PATH:$TAG_MAJOR -t $HUB_PATH:latest . && \
docker push $HUB_PATH:$TAG_VERSION && \
docker push $HUB_PATH:$TAG_MAJOR && \
docker push $HUB_PATH:latest

Refactor Subitoo into a maintainable, object-oriented, and modular framework:
- Apply object-oriented design principles to the core architecture
- Decouple site-specific scraping logic from the core engine
- Allow additional website scrapers to be added as independent modules
The first run of a search query does not send notifications; it only populates the database.
You will receive notifications on subsequent runs if a new item appears or an existing one changes.
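The behaviour described above can be pictured as a small decision per scanned listing. The sketch below is illustrative only: it assumes a hypothetical state file with one `ID<TAB>PRICE` line per known listing, whereas the real Subitoo database and the fields it compares are not described here. The function name `classify_listing` is also hypothetical.

```shell
# Hypothetical helper, not Subitoo's implementation.
# Prints one of: first-run, new, changed, unchanged.
classify_listing() {
  state_file=$1   # assumed format: ID<TAB>PRICE, one listing per line
  id=$2
  price=$3
  if [ ! -f "$state_file" ]; then
    # Nothing stored yet: the run only populates the store, no notification.
    echo "first-run"
    return
  fi
  awk -F '\t' -v id="$id" -v price="$price" '
    $1 == id { found = 1; print ($2 == price ? "unchanged" : "changed"); exit }
    END      { if (!found) print "new" }
  ' "$state_file"
}
```

In this model, only `new` and `changed` would trigger a notification, for example `classify_listing state.tsv 101 180` after listing 101 was stored with a different price.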
Scanning every result page (--pages 0) is fine only if your URL is safe, meaning that it doesn't return too many results.
Check this URL for example:
https://www.subito.it/annunci-italia/vendita/usato/?q=apple
This returns more than 70,000 results! That's around 300 pages for Subitoo to scan, which will take time.
So please use --pages 0 only if you know what you are doing.
Check all the parameters for the add command here:
subitoo add --help

Overlapping runs are harmless: there is a built-in lock, so if you execute subitoo run while the previous execution is still running, the new one is ignored.
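How Subitoo implements its lock is not shown here. As a generic illustration of the same skip-if-already-running pattern, the sketch below uses `mkdir`, which either creates the directory or fails atomically. The function name `run_exclusive` and the lock path in the usage comment are hypothetical.

```shell
# Hypothetical helper, not Subitoo's built-in lock.
# Runs the given command unless the lock directory already exists.
run_exclusive() {
  lock_dir=$1
  shift
  if mkdir "$lock_dir" 2>/dev/null; then
    "$@"                 # we own the lock: run the command
    status=$?
    rmdir "$lock_dir"    # release the lock afterwards
    return "$status"
  fi
  echo "previous run still active, skipping"
  return 0
}

# Usage (hypothetical lock path):
#   run_exclusive /tmp/subitoo-demo.lock ./subitoo.sh run
```

A production version would also need to clean up the lock when the command is killed, which this sketch leaves out.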
Inside the 'data' folder you will find the database and the logs. Feel free to back them up to prevent data loss.
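One possible way to do that is a timestamped archive of the folder. This is a minimal sketch, assuming the 'data' folder sits directly inside your Subitoo checkout; the function name `backup_data` and the paths in the usage comment are hypothetical.

```shell
# Hypothetical helper, not part of Subitoo.
# Archives <subitoo_dir>/data into <dest_dir> and prints the archive path.
backup_data() {
  src_dir=$1    # the Subitoo checkout that contains the 'data' folder
  dest_dir=$2   # where the archive should be written
  stamp=$(date +%Y%m%d-%H%M%S)
  archive="$dest_dir/subitoo-data-$stamp.tar.gz"
  mkdir -p "$dest_dir" &&
    tar -czf "$archive" -C "$src_dir" data &&
    echo "$archive"
}

# Usage (hypothetical paths):
#   backup_data /your/absolute/path/to/subitoo "$HOME/backups"
```

It is probably safest to run this while no scan is in progress, so the database is not copied in the middle of a write.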
If an item is already in the database and gets scanned again, it will be compared against the existing version.
NOTICE: If your search reads only page 1 of the results, items that move to later pages will not be read again until they return to page 1.