python - Retrieve location data using BeautifulSoup
I'd like to get the location information from the infobox on the following wiki page.
Here is what I've tried:
import re
import requests
from bs4 import BeautifulSoup

r = requests.get('https://en.wikipedia.org/wiki/alabama_department_of_youth_services_schools', proxies=proxies)
html_source = r.text
soup = BeautifulSoup(html_source)
school_d['name'] = soup.find('h1', 'firstHeading').get_text()
print(soup.find('th', text=re.compile("Location")).find_next_sibling())
Output: None
I'm guessing I'm unable to access the <td> element because it's not a sibling?
Any advice?
>>> table = soup.find("table", class_="infobox")
>>> name = table.find("th").text
>>> country = table.find("th", text="Country").parent.find("td").text
>>> country = table.find("th", text="Country").find_next_sibling().text  # also works
>>> location = table.find("th", text="Location").parent.find("td").text
>>> location = table.find("th", text="Location").find_next_sibling().text  # also works
Something like that?
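For anyone who wants to try the lookup without hitting the network, here is a minimal sketch that wraps the same idea in a helper. Note the assumptions: `SAMPLE_HTML` is a made-up, simplified stand-in for an infobox (not the real page markup), and `infobox_value` is a hypothetical helper name, not part of BeautifulSoup. It compares against `th.get_text()` instead of using `text=`, since a `text=` match can miss a header whose label is mixed with other tags, and it returns None instead of raising when a row is missing.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for an infobox; the values are placeholders.
SAMPLE_HTML = """
<h1 id="firstHeading" class="firstHeading">Example School</h1>
<table class="infobox">
  <tr><th colspan="2">Example School</th></tr>
  <tr><th scope="row">Location</th><td>Example Town, Example State</td></tr>
  <tr><th scope="row">Country</th><td>Exampleland</td></tr>
</table>
"""


def infobox_value(soup, label):
    """Return the text of the <td> in the infobox row whose <th> equals label, or None."""
    table = soup.find("table", class_="infobox")
    if table is None:
        return None
    # Match the whole header text, ignoring case and surrounding whitespace.
    pattern = re.compile(r"^\s*" + re.escape(label) + r"\s*$", re.IGNORECASE)
    for th in table.find_all("th"):
        if pattern.match(th.get_text()):
            # Prefer the sibling <td>; fall back to any <td> in the same row.
            td = th.find_next_sibling("td")
            if td is None and th.parent is not None:
                td = th.parent.find("td")
            return td.get_text(strip=True) if td is not None else None
    return None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
location = infobox_value(soup, "Location")
country = infobox_value(soup, "Country")
missing = infobox_value(soup, "Founded")  # no such row in the sample
print(location, country, missing)
```

With a real page you would build `soup` from `requests.get(url).text` instead of the sample string; the real markup may differ from this sketch, so check for a None result before using it.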