Scrapy Csv Output "randomly" Missing Fields
My Scrapy crawler correctly reads all fields, as the debug output shows:

2017-01-29 02:45:15 [scrapy.core.scraper] DEBUG: Scraped from <200 https://www.willhaben.at/iad/immobilie
Solution 1:
If you yield scraped results as dicts, the CSV columns are populated from the keys of the first yielded dict, as the exporter's header-writing logic shows:
def _write_headers_and_set_fields_to_export(self, item):
    if self.include_headers_line:
        if not self.fields_to_export:
            if isinstance(item, dict):
                # for dicts try using fields of the first item
                self.fields_to_export = list(item.keys())
            else:
                # use fields declared in Item
                self.fields_to_export = list(item.fields.keys())
        row = list(self._build_row(self.fields_to_export))
        self.csv_writer.writerow(row)
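To illustrate the consequence, here is a minimal sketch (the spider and field names are hypothetical): if the first yielded dict happens to be missing a key, that key never becomes a CSV column, even though later items contain it.

import scrapy


class ExampleSpider(scrapy.Spider):
    # Hypothetical spider used only to demonstrate the header behaviour.
    name = "example"
    start_urls = ["https://example.com"]

    def parse(self, response):
        # The first yielded dict lacks the "price" key, so when exporting
        # with `scrapy crawl example -o out.csv` the header is taken from
        # this dict and the "price" column is silently dropped for all rows.
        yield {"title": "First listing"}
        # Later items do contain "price", but the columns are already fixed.
        yield {"title": "Second listing", "price": "100000"}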
So you should either define an Item with all fields declared explicitly and populate it, or write a custom CSVItemExporter.
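A minimal sketch of the first option, with field names assumed for illustration: declare every expected column on an Item subclass so the exporter takes the header from item.fields rather than from whichever dict happens to be yielded first.

import scrapy


class PropertyItem(scrapy.Item):
    # Declaring every field up front means the CSV header always contains
    # all columns, even when some values are missing for a given row.
    title = scrapy.Field()
    price = scrapy.Field()
    location = scrapy.Field()


class PropertySpider(scrapy.Spider):
    # Hypothetical spider showing how the Item is populated.
    name = "property"
    start_urls = ["https://example.com"]

    def parse(self, response):
        item = PropertyItem()
        item["title"] = response.css("h1::text").get()
        item["price"] = response.css(".price::text").get()
        # "location" is left unset here; the exporter still emits the column
        # because it is declared on the Item, just with an empty cell.
        yield item

The custom-exporter alternative would amount to subclassing CsvItemExporter and setting fields_to_export yourself, but declaring the fields on an Item is usually the simpler fix.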