Hello,
I am trying to read a JSON element out of a web page (example: https://www.lovelybooks.de/autor/Fredrika…mer-1108974876/) that has the following form:
Code
<script type="application/ld+json">{"@context":"http://schema.org","@type":"ItemList","itemListElement":[{"@type":"Book","name":"Die Holzhammer-Methode","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/Die-Holzhammer-Methode-945569092-w/","author":{"@type":"Person","name":"Fredrika Gers","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/"},"position":1},{"@type":"Book","name":"Teufelshorn","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/Teufelshorn-1046225387-w/","author":{"@type":"Person","name":"Fredrika Gers","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/"},"position":2},{"@type":"Book","name":"Gut getroffen","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/Gut-getroffen-1112506490-w/","author":{"@type":"Person","name":"Fredrika Gers","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/"},"position":3},{"@type":"Book","name":"Frühjahrsputz","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/Fr%C3%BChjahrsputz-1161904735-w/","author":{"@type":"Person","name":"Fredrika Gers","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/"},"position":4},{"@type":"Book","name":"Mord am Toten Mann","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/Mord-am-Toten-Mann-1451954751-w/","author":{"@type":"Person","name":"Fredrika Gers","url":"https://www.lovelybooks.de/autor/Fredrika-Gers/"},"position":5}]}</script>
I found two pages on this topic:
https://stackoverflow.com/questions/4365…on-using-python
https://stackoverflow.com/questions/3616…g-beautifulsoup
Now I am trying my luck:
Python
from BeautifulSoup import BeautifulSoup
import urllib2
import json
import xbmc

# Fetch the raw HTML of the series page
url = 'https://www.lovelybooks.de/autor/Fredrika-Gers/reihe/Franz-Holzhammer-1108974876/'
html = urllib2.urlopen(url).read()
xbmc.log('HTML output %s' % (html))
# Parse the page and look for the JSON-LD script tag
soup = BeautifulSoup(html, "html.parser")
raw_data = soup.find('script', {'type':'application/ld+json'})
data = json.load(raw_data)
xbmc.log('JSON output %s' % (data))
Reading the HTML page seems to work without problems, but at the line soup = BeautifulSoup(html, "html.parser") I get the following error message:
Code
ERROR: EXCEPTION Thrown (PythonToCppException) : -->Python callback/script returned the following error<--
- NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
Error Type: <type 'exceptions.AttributeError'>
Error Contents: 'str' object has no attribute 'text'
Traceback (most recent call last):
File "C:\Users\User\AppData\Roaming\Kodi\addons\plugin.video.example\main.py", line 301, in <module>
router(sys.argv[2][1:])
File "C:\Users\User\AppData\Roaming\Kodi\addons\plugin.video.example\main.py", line 253, in router
list_categories()
File "C:\Users\User\AppData\Roaming\Kodi\addons\plugin.video.example\main.py", line 147, in list_categories
categories = get_categories()
File "C:\Users\User\AppData\Roaming\Kodi\addons\plugin.video.example\main.py", line 113, in get_categories
soup = BeautifulSoup(html, "html.parser")
File "C:\Users\User\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib\BeautifulSoup.py", line 1522, in __init__
BeautifulStoneSoup.__init__(self, *args, **kwargs)
File "C:\Users\User\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib\BeautifulSoup.py", line 1147, in __init__
self._feed(isHTML=isHTML)
File "C:\Users\User\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib\BeautifulSoup.py", line 1189, in _feed
SGMLParser.feed(self, markup)
File "C:\Program Files (x86)\Kodi\system\python\Lib\sgmllib.py", line 104, in feed
self.goahead(0)
File "C:\Program Files (x86)\Kodi\system\python\Lib\sgmllib.py", line 174, in goahead
k = self.parse_declaration(i)
File "C:\Users\User\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib\BeautifulSoup.py", line 1463, in parse_declaration
j = SGMLParser.parse_declaration(self, i)
File "C:\Program Files (x86)\Kodi\system\python\Lib\markupbase.py", line 109, in parse_declaration
self.handle_decl(data)
File "C:\Users\User\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib\BeautifulSoup.py", line 1448, in handle_decl
self._toStringSubclass(data, Declaration)
File "C:\Users\User\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib\BeautifulSoup.py", line 1381, in _toStringSubclass
self.endData(subclass)
File "C:\Users\User\AppData\Roaming\Kodi\addons\script.module.beautifulsoup\lib\BeautifulSoup.py", line 1251, in endData
(not self.parseOnlyThese.text or \
AttributeError: 'str' object has no attribute 'text'
-->End of Python script error report<--
Unfortunately I don't understand this... could someone please help me or give me some tips on how to continue?
Thanks,
Linkin
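
A possible way forward, offered as a sketch rather than a definitive fix: the traceback suggests that the BeautifulSoup 3 module (script.module.beautifulsoup) treats the second positional argument as parseOnlyThese, so the string "html.parser" ends up where a filter object is expected. The "html.parser" argument belongs to the newer bs4 package. Dropping that argument would probably get past this error. After that, json.load expects a file-like object, so json.loads on the tag's text (raw_data.string in BeautifulSoup 3, as far as I know) is likely what is needed. The example below avoids BeautifulSoup altogether and uses only re and json. The names extract_ld_json, book_titles and sample_html are made up for illustration, and sample_html is a shortened stand-in for the downloaded page.

```python
import json
import re

# Matches <script type="application/ld+json"> ... </script> and captures the body.
# Assumption: the type attribute is quoted, as in the page source shown above.
LD_JSON_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)


def extract_ld_json(html):
    """Return the parsed content of every JSON-LD script tag found in html."""
    return [json.loads(body) for body in LD_JSON_RE.findall(html)]


def book_titles(html):
    """Return (position, name) pairs of the first ItemList, sorted by position."""
    for block in extract_ld_json(html):
        if isinstance(block, dict) and block.get('@type') == 'ItemList':
            items = block.get('itemListElement', [])
            return sorted((item['position'], item['name']) for item in items)
    return []


# Hypothetical, shortened stand-in for the HTML returned by urlopen(url).read()
sample_html = (
    '<!DOCTYPE html><html><head>'
    '<script type="application/ld+json">'
    '{"@context":"http://schema.org","@type":"ItemList","itemListElement":['
    '{"@type":"Book","name":"Die Holzhammer-Methode","position":1},'
    '{"@type":"Book","name":"Teufelshorn","position":2}]}'
    '</script></head><body></body></html>'
)

for position, name in book_titles(sample_html):
    print('%d: %s' % (position, name))
```

In the add-on, html from urllib2.urlopen(url).read() would take the place of sample_html, and the print call would become xbmc.log. A regular expression is fragile against markup changes, so this is only meant to show the idea when a proper parser is not cooperating.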