Google Search Results - link url isn't returning properly, among others #227

Open
smithellis opened this issue Oct 30, 2018 · 5 comments

Comments

@smithellis

When I return results from a Google scrape, some values are now incorrect or missing. For instance, 'domain' is consistently returned as "b' '", and 'link' is now regularly 'None'. link_type, serp_id, snippet and visible link all seem OK. Was there a recent change that might have caused this, or is something borked on my end?

Thanks.

@symbios-zi

@smithellis you can fix it in parsing.py, located in this package. It seems Google changed its markup. Try this selector for 'link':

                'link': 'div.r > a:first-child::attr(href)',

@smithellis
Author

Thanks - I had actually tried updating the parsing.py file, and used: 'link': 'h3.LC20lb > a:first-child::attr(href)' -- but that didn't work. I changed it in the Google search section, under us_ip and de_ip. Either my selector is hot garbage or something is borked.

@smithellis
Author

I just realized my bug report is like...the worst.

I am using the version from Git - 0.2.4
I am running it in a virtualenv with Python 3.5.3
I'm using this inside a Flask application, but can confirm the issue outside the Flask app.

Hope that makes me less of a nuisance and more of a helper - I like this app.

@symbios-zi

@smithellis

As far as I can see, on the Russian version of Google the markup is:

<a href="http://www.cambridgeenglish.org.ru/test-your-english/adult-learners/">
    <h3 class="LC20lb">Title</h3>
    <br>
</a>
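
In this markup the `<a>` wraps the `<h3 class="LC20lb">`, not the other way around, so a child selector that starts at the `h3` and looks for an `a` beneath it would find nothing. Here is a minimal sketch of pulling the href from the wrapping anchor instead. It uses only the standard library's `html.parser` and is not the parsing code from this package; `ResultLinkParser`, `find_result_links`, and `SAMPLE` are hypothetical names made up for illustration, and the sample is just the snippet above.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not go on the stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}

# Sample markup copied from the snippet above (illustration only).
SAMPLE = """
<a href="http://www.cambridgeenglish.org.ru/test-your-english/adult-learners/">
    <h3 class="LC20lb">Title</h3>
    <br>
</a>
"""


class ResultLinkParser(HTMLParser):
    """Collect the href of every <a> that is the direct parent of an h3.LC20lb."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # stack of (tag, attrs) for currently open elements
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "h3" and "LC20lb" in classes:
            # The direct parent is whatever element is on top of the stack.
            if self.open_tags and self.open_tags[-1][0] == "a":
                self.links.append(self.open_tags[-1][1].get("href"))
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, attrs))

    def handle_endtag(self, tag):
        # Pop back to the most recent matching open tag, if there is one.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                del self.open_tags[i:]
                break


def find_result_links(html):
    parser = ResultLinkParser()
    parser.feed(html)
    return parser.links


print(find_result_links(SAMPLE))
```

If the nesting were reversed (the `a` inside the `h3`), this sketch would return an empty list, which is a quick way to check which shape of markup a given results page is actually serving.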

@johnbaris92

Is there an update for this?
