python - Why doesn't findall work on the table data I pulled from JavaScript (using Python 3)


I can do the two parts of what I want separately, but not together. I'm using the Anaconda 3 distribution.

The tables I want are dynamically loaded JavaScript tables, and I want to extract them and use them in pandas and sqlite3.

This gives me text output, delivered as a separate line for each cell, of the information I want:

    import sys
    from PyQt5.QtWidgets import QApplication
    from PyQt5.QtCore import QUrl
    from PyQt5.QtWebKitWidgets import QWebPage

    import bs4 as bs
    import requests
    import pandas as pd


    class Client(QWebPage):

        def __init__(self, url):
            self.app = QApplication(sys.argv)
            QWebPage.__init__(self)
            self.loadFinished.connect(self.on_page_load)
            self.mainFrame().load(QUrl(url))
            self.app.exec()

        def on_page_load(self):
            self.app.quit()    # only run until the page loads


    url = 'http://results.houstonmarathon.com/2017/?page=2&event=mara&pid=search&search%5bclub%5d=%25&search%5bcompany%5d=%25&search%5bnation%5d=%25&search_sort=name'
    client_response = Client(url)
    source = client_response.mainFrame().toHtml()

    soup = bs.BeautifulSoup(source, 'lxml')
    js_table = soup.find("table", {"class": "list-table"})
    js_table_content = js_table.text

    # write the table text out, one cell per line
    with open('hmarathont.txt', 'w') as f:
        f.write(js_table_content)

Which is great, but I imagine that taking the text file and parsing it back into the right format (currently it's one line for each cell, and I want the original layout) is not the best way to do it, when there must be a way to get the table straight into pandas.
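For illustration only, here is a minimal sketch of one direct route: handing HTML to `pandas.read_html`. The HTML string is a made-up stand-in for the rendered page source, and the column names and runner names are hypothetical. It also assumes that an HTML parser pandas can use (such as lxml) is installed.

```python
import io

import pandas as pd

# Hypothetical stand-in for the rendered page source; the real table has more columns.
html = """
<table class="list-table">
  <thead><tr><th>Place</th><th>Name</th></tr></thead>
  <tbody>
    <tr><td>1</td><td>Runner A</td></tr>
    <tr><td>2</td><td>Runner B</td></tr>
  </tbody>
</table>
"""

# read_html returns a list of DataFrames, one per matching <table>.
# attrs narrows the match to tables with the given HTML attributes.
tables = pd.read_html(io.StringIO(html), attrs={"class": "list-table"})
df = tables[0]
print(df)
```

Whether this works on the real page depends on the table markup that the browser engine actually renders.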

I have had success getting tables into pandas with the code I wrote below, but not with dynamically loaded tables. I already have code that works for taking CSV table data into sqlite3, which is why I export the DataFrame to CSV.
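The CSV-to-sqlite3 step is not shown in the post. As a rough sketch of what such a step could look like using only the standard library (the table name `results`, the three columns kept, and the sample rows are all assumptions made for illustration):

```python
import csv
import io
import sqlite3

# Hypothetical CSV content standing in for the exported file.
csv_text = "ovrl_pl,raw_name,time_net\n1,Runner A,02:10:00\n2,Runner B,02:12:30\n"

# Read the CSV rows into tuples that match the table's column order.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [(int(r["ovrl_pl"]), r["raw_name"], r["time_net"]) for r in reader]

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (ovrl_pl INTEGER, raw_name TEXT, time_net TEXT)")
conn.executemany("INSERT INTO results VALUES (?, ?, ?)", rows)
conn.commit()

loaded = conn.execute(
    "SELECT ovrl_pl, raw_name, time_net FROM results ORDER BY ovrl_pl"
).fetchall()
print(loaded)
conn.close()
```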

    # this is the line that raises the error
    rows = js_table.findall('tr')[1:]

    # one list per output column
    data = {'ovrl_pl': [], 'gndr_pl': [], 'place_div': [], 'raw_name': [],
            'gender': [], 'state': [], 'age_cat': [], 'age': [],
            'time_net': [], 'time_gun': []}

    for row in rows:
        cols = row.find_all('td')
        data['ovrl_pl'].append(cols[0].get_text())
        data['gndr_pl'].append(cols[1].get_text())
        data['place_div'].append(cols[2].get_text())
        data['raw_name'].append(cols[3].get_text())
        data['gender'].append(cols[4].get_text())
        data['state'].append(cols[5].get_text())
        data['age_cat'].append(cols[6].get_text())
        data['age'].append(cols[7].get_text())
        data['time_net'].append(cols[8].get_text())
        data['time_gun'].append(cols[9].get_text())

    hmarathon = pd.DataFrame(data)
    hmarathon.to_csv("hmarathon2017test.csv")

The error I keep getting is at the point where I try to use findall on js_table. It's a NoneType, so I'm guessing the table existed briefly and is then gone? (Or maybe I don't understand any of this :-/ ).

I managed to capture the table cell contents as text, so why can't I capture them in a dictionary?
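One thing that may be worth checking, sketched here on a made-up HTML snippet rather than the real page: in BeautifulSoup 4, an unrecognised attribute on a tag is treated as a search for a child element of that name. With the all-lowercase spelling, `js_table.findall` would therefore be `None` even when the table itself was found, and calling it would raise a NoneType error. The column names and runner names below are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the rendered page source.
html = """
<table class="list-table">
  <tr><th>Place</th><th>Name</th></tr>
  <tr><td>1</td><td>Runner A</td></tr>
  <tr><td>2</td><td>Runner B</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"class": "list-table"})

# Attribute access on a tag is shorthand for find(): this looks for a
# <findall> child element, finds none, and returns None.
missing = table.findall
print(missing)

# find_all is the actual method; skip the header row as in the post.
rows = table.find_all("tr")[1:]
data = {"place": [], "name": []}
for row in rows:
    cols = row.find_all("td")
    data["place"].append(cols[0].get_text())
    data["name"].append(cols[1].get_text())
print(data)
```

If the failure on the real page comes from something else, for example `js_table` itself being `None`, this sketch does not cover it.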

