AttributeError crash when resuming wiki dump after stopping during the retrieving titles section #54

@mattsblack

When I tried to resume my wiki dump after stopping it with Ctrl+C, it crashed with an AttributeError.
Here is the output from the first run:

wdg --delay 0.5 https://dragonvale.fandom.com/wiki/DragonVale_Wiki --xml --images
Checking API... https://dragonvale.fandom.com/api.php
API is OK:  https://dragonvale.fandom.com/api.php
Checking index.php... https://dragonvale.fandom.com/index.php
check_index(): Trying Special:Random...
POST https://dragonvale.fandom.com/index.php {'title': 'Special:Random'} 301
GET https://dragonvale.fandom.com/wiki/Special:Random {'title': 'Special:Random'} 302
GET https://dragonvale.fandom.com/wiki/Orange_Flame_of_Corruption {'title': 'Special:Random'} 200
index.php available probability: 90% (0.9)
index.php is OK
No --path argument provided. Defaulting to:
  [working_directory]/[domain_prefix]-[date]-wikidump
Which expands to:
  ./dragonvale.fandom.com-20250828-wikidump
Undo monkey patch...
#########################################################################
# Welcome to DumpGenerator 4.4.4 by WikiTeam3 (GPL v3)                  #
# More info at: <https://github.com/saveweb/wikiteam3>                  #
# Copyright (C) 2011-2025 WikiTeam developers                           #
#########################################################################

Analysing https://dragonvale.fandom.com/api.php
No suitable dump found at Internet Archive
Trying generating a new dump into a new directory...
https://dragonvale.fandom.com/api.php

Retrieving the XML for every page from the beginning


Retrieving the XML for every page

Loading page titles from namespaces = all
Excluding titles from namespaces = None
38 namespaces found
    Retrieving titles in the namespace 0
    Retrieving titles in the namespace 1
    Retrieving titles in the namespace 2
    Retrieving titles in the namespace 3
    Retrieving titles in the namespace 4
    Retrieving titles in the namespace 5
    Retrieving titles in the namespace 6
Delay 0.5s: Session delay: wikiteam3.dumpgenerator.api.page_titles ^C.Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/listing.py", line 63, in __next__
    item = next(self._iter)
StopIteration

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.13/bin/wikiteam3dumpgenerator", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/__init__.py", line 4, in main
    DumpGenerator()
    ~~~~~~~~~~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/generator.py", line 86, in __init__
    DumpGenerator.createNewDump(config=config, other=other)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/generator.py", line 112, in createNewDump
    generate_XML_dump(config=config, session=other.session)
    ~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/xmldump/xml_dump.py", line 146, in generate_XML_dump
    doXMLExportDump(config, session, xmlfile, lastPage)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/xmldump/xml_dump.py", line 71, in doXMLExportDump
    for title in read_titles(config, session=session, start=start):
                 ~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/api/page_titles.py", line 266, in read_titles
    getPageTitles(config=config, session=session)
    ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/api/page_titles.py", line 215, in getPageTitles
    for title in titles:
                 ^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/api/page_titles.py", line 47, in getPageTitlesAPI
    for page in site.allpages(namespace=namespace):
                ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/listing.py", line 188, in __next__
    info = super(GeneratorList, self).__next__()
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/listing.py", line 69, in __next__
    self.load_chunk()
    ~~~~~~~~~~~~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/listing.py", line 199, in load_chunk
    return super(GeneratorList, self).load_chunk()
           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/listing.py", line 99, in load_chunk
    data = self.site.get(
        'query', (self.generator, self.list_name),
        *[(str(k), v) for k, v in self.args.items()]
    )
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/client.py", line 300, in get
    return self.api(action, 'GET', *args, **kwargs)
           ~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/client.py", line 358, in api
    info = self.raw_api(action, http_method, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/client.py", line 569, in raw_api
    res = self.raw_call('api', data, retry_on_error=retry_on_error,
                        http_method=http_method)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/mwclient/client.py", line 497, in raw_call
    stream = self.connection.request(http_method, url, **args)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/utils/monkey_patch.py", line 142, in new_send
    Delay(msg=self.delay_msg, config=self.config)
    ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/cli/delay.py", line 57, in __init__
    time.sleep(delay)
    ~~~~~~~~~~^^^^^^^
KeyboardInterrupt
Undo monkey patch...

And here is the output from the second run, where I tried resuming:

wdg --delay 0.5 https://dragonvale.fandom.com/wiki/DragonVale_Wiki --xml --images
Checking API... https://dragonvale.fandom.com/api.php
API is OK:  https://dragonvale.fandom.com/api.php
Checking index.php... https://dragonvale.fandom.com/index.php
check_index(): Trying Special:Random...
POST https://dragonvale.fandom.com/index.php {'title': 'Special:Random'} 301
GET https://dragonvale.fandom.com/wiki/Special:Random {'title': 'Special:Random'} 302
GET https://dragonvale.fandom.com/wiki/Valeodon_Dragon/Pedestal {'title': 'Special:Random'} 200
index.php available probability: 90% (0.9)
index.php is OK
No --path argument provided. Defaulting to:
  [working_directory]/[domain_prefix]-[date]-wikidump
Which expands to:
  ./dragonvale.fandom.com-20250828-wikidump
Undo monkey patch...
#########################################################################
# Welcome to DumpGenerator 4.4.4 by WikiTeam3 (GPL v3)                  #
# More info at: <https://github.com/saveweb/wikiteam3>                  #
# Copyright (C) 2011-2025 WikiTeam developers                           #
#########################################################################

Analysing https://dragonvale.fandom.com/api.php

Warning!: "./dragonvale.fandom.com-20250828-wikidump" path exists
There is a dump in "./dragonvale.fandom.com-20250828-wikidump", probably incomplete.
If you choose resume, to avoid conflicts, some parameters you have chosen in the current session will be ignored
and the parameters available in "./dragonvale.fandom.com-20250828-wikidump/config.json" will be loaded.
Do you want to resume (y/n)? y
You have selected: YES
Loading config file to resume...
Resuming previous dump process...
Resuming XML dump from "Main Page" (revision id 2043)
https://dragonvale.fandom.com/api.php
Removing the last chunk of past XML dump: it is probably incomplete.
len(incomplete_segment.encode("utf-8")) returned 3938, while os.path.getsize(filename) returned 3938, so fh.truncate() would be fh.truncate(0), which would be illegal. Something is seriously wrong here!
Adding newline to end of ./dragonvale.fandom.com-20250828-wikidump/dragonvale.fandom.com-20250828-history.xml
WARNING: will try to start the download...

Retrieving the XML for every page

Failed to find title in last trunk XML: b'<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.mediawiki.org/xml/export-0.11/ http://www.mediawiki.org/xml/export-0.11.xsd" version="0.11" xml:lang="en">\n  <siteinfo>\n    <sitename>DragonVale Wiki</sitename>\n    <dbname>dragonvale</dbname>\n    <base>https://dragonvale.fandom.com/wiki/DragonVale_Wiki</base>\n    <generator>MediaWiki 1.43.1</generator>\n    <case>first-letter</case>\n    <namespaces>\n      <namespace key="-2" case="first-letter">Media</namespace>\n      <namespace key="-1" case="first-letter">Special</namespace>\n      <namespace key="0" case="first-letter"/>\n      <namespace key="1" case="first-letter">Talk</namespace>\n      <namespace key="2" case="first-letter">User</namespace>\n      <namespace key="3" case="first-letter">User talk</namespace>\n      <namespace key="4" case="first-letter">DragonVale Wiki</namespace>\n      <namespace key="5" case="first-letter">DragonVale Wiki talk</namespace>\n      <namespace key="6" case="first-letter">File</namespace>\n      <namespace key="7" case="first-letter">File talk</namespace>\n      <namespace key="8" case="first-letter">MediaWiki</namespace>\n      <namespace key="9" case="first-letter">MediaWiki talk</namespace>\n      <namespace key="10" case="first-letter">Template</namespace>\n      <namespace key="11" case="first-letter">Template talk</namespace>\n      <namespace key="12" case="first-letter">Help</namespace>\n      <namespace key="13" case="first-letter">Help talk</namespace>\n      <namespace key="14" case="first-letter">Category</namespace>\n      <namespace key="15" case="first-letter">Category talk</namespace>\n      <namespace key="110" case="first-letter">Forum</namespace>\n      <namespace key="111" case="first-letter">Forum talk</namespace>\n      <namespace key="112" case="first-letter">Data</namespace>\n      <namespace key="113" 
case="first-letter">Data talk</namespace>\n      <namespace key="114" case="first-letter">DragonVale World Wiki</namespace>\n      <namespace key="115" case="first-letter">DragonVale World Wiki talk</namespace>\n      <namespace key="420" case="first-letter">GeoJson</namespace>\n      <namespace key="421" case="first-letter">GeoJson talk</namespace>\n      <namespace key="500" case="first-letter">User blog</namespace>\n      <namespace key="501" case="first-letter">User blog comment</namespace>\n      <namespace key="502" case="first-letter">Blog</namespace>\n      <namespace key="503" case="first-letter">Blog talk</namespace>\n      <namespace key="828" case="first-letter">Module</namespace>\n      <namespace key="829" case="first-letter">Module talk</namespace>\n      <namespace key="1200" case="first-letter">Message Wall</namespace>\n      <namespace key="1201" case="first-letter">Thread</namespace>\n      <namespace key="1202" case="first-letter">Message Wall Greeting</namespace>\n      <namespace key="2000" case="first-letter">Board</namespace>\n      <namespace key="2001" case="first-letter">Board Thread</namespace>\n      <namespace key="2002" case="first-letter">Topic</namespace>\n      <namespace key="2900" case="first-letter">Map</namespace>\n      <namespace key="2901" case="first-letter">Map talk</namespace>\n    </namespaces>\n  </siteinfo>\n  <page>\n    <title>Main Page</title>\n    <ns>0</ns>\n    <id>2043</id>\n    <redirect title="DragonVale Wiki"/>\n    <revision>\n      <id>3903</id>\n      <timestamp>2011-09-10T18:44:21Z</timestamp>\n      <contributor>\n        <username>Default</username>\n        <id>49312</id>\n      </contributor>\n      <comment>moved [[Main Page]] to [[DragonVale Wiki]]: SEO</comment>\n      <origin>3903</origin>\n      <model>wikitext</model>\n      <format>text/x-wiki</format>\n      <text bytes="29" sha1="nxiu2vtgfda3mrzza98jtsj2xu9ffo7" xml:space="preserve">#REDIRECT [[DragonVale Wiki]]</text>\n      
<sha1>nxiu2vtgfda3mrzza98jtsj2xu9ffo7</sha1>\n    </revision>\n  </page>\n</mediawiki>'
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.13/bin/wikiteam3dumpgenerator", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/__init__.py", line 4, in main
    DumpGenerator()
    ~~~~~~~~~~~~~^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/generator.py", line 84, in __init__
    DumpGenerator.resumePreviousDump(config=config, other=other)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/generator.py", line 169, in resumePreviousDump
    generate_XML_dump(
    ~~~~~~~~~~~~~~~~~^
        config=config,
        ^^^^^^^^^^^^^^
        session=other.session,
        ^^^^^^^^^^^^^^^^^^^^^^
        resume=True,
        ^^^^^^^^^^^^
    )
    ^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/xmldump/xml_dump.py", line 146, in generate_XML_dump
    doXMLExportDump(config, session, xmlfile, lastPage)
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Library/Frameworks/Python.framework/Versions/3.13/lib/python3.13/site-packages/wikiteam3/dumpgenerator/dump/xmldump/xml_dump.py", line 62, in doXMLExportDump
    start = lastPage.find('title').text
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'text'

I think this happens because the first run was interrupted before it had retrieved all the page titles.
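
The traceback shows that `lastPage.find('title')` returned `None`, even though the chunk printed in the "Failed to find title" message does contain `<title>Main Page</title>`. One possible contributing factor, which is only a guess from the output, is that the root element declares a default namespace (`xmlns="http://www.mediawiki.org/xml/export-0.11/"`), and ElementTree-style `find()` calls with an unqualified tag name do not match namespaced elements. The sketch below uses the standard library rather than wikiteam3's own code; `last_page_title` is a hypothetical helper, and the XML is a trimmed copy of the chunk from the log.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the chunk printed in the "Failed to find title" message.
CHUNK = (
    '<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/" '
    'version="0.11" xml:lang="en">'
    "<page><title>Main Page</title><ns>0</ns><id>2043</id></page>"
    "</mediawiki>"
)


def last_page_title(xml_text):
    """Hypothetical helper: title of the last <page>, or None if absent."""
    root = ET.fromstring(xml_text)
    # "{*}" matches a tag in any namespace or in none (Python 3.8+).
    pages = root.findall(".//{*}page")
    if not pages:
        return None
    title = pages[-1].find("{*}title")
    # Guard against None instead of calling .text unconditionally.
    return title.text if title is not None else None


root = ET.fromstring(CHUNK)
# The unqualified lookup misses the namespaced <title> element.
print(root.find(".//title"))
# The namespace-aware lookup finds it.
print(last_page_title(CHUNK))
```

Whatever the real cause turns out to be, checking the result of `find()` for `None` before reading `.text` would turn the crash into a handled condition.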
Also, unrelated to this: when I resume a dump with --resume --path and pass a different --delay value, the new value doesn't override the one saved from the first run.
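
The resume prompt in the log says that some parameters from the current session are ignored in favour of the ones stored in config.json, which would explain the delay behaviour. A minimal sketch of how a resume step could still let selected options through is shown below. Everything here is hypothetical and not taken from wikiteam3: the function name, the `OVERRIDABLE_ON_RESUME` set, and the plain-dict config shape are assumptions for illustration only.

```python
# Hypothetical: options considered safe to change between runs.
OVERRIDABLE_ON_RESUME = {"delay"}


def merge_resume_config(saved, cli):
    """Start from the saved config; apply only whitelisted CLI overrides."""
    merged = dict(saved)
    for key in OVERRIDABLE_ON_RESUME:
        # None means the option was not given on the command line.
        if cli.get(key) is not None:
            merged[key] = cli[key]
    return merged


saved = {"delay": 0.5, "xml": True, "images": True}
cli = {"delay": 2.0, "xml": False}
print(merge_resume_config(saved, cli))
```

In this sketch only `delay` follows the new command line; `xml` and `images` keep their saved values, so the options that define what the dump contains cannot drift between runs.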
