<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Web Comic Downloaders</title>
	<atom:link href="http://blog.afoolishmanifesto.com/archives/770/feed" rel="self" type="application/rss+xml" />
	<link>http://blog.afoolishmanifesto.com/archives/770</link>
	<description>fREWdiculous!</description>
	<lastBuildDate>Fri, 30 Jul 2010 15:07:07 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
	<item>
		<title>By: David W.</title>
		<link>http://blog.afoolishmanifesto.com/archives/770/comment-page-1#comment-167</link>
		<dc:creator>David W.</dc:creator>
		<pubDate>Tue, 02 Jun 2009 16:58:17 +0000</pubDate>
		<guid isPermaLink="false">http://blog.afoolishmanifesto.com/?p=770#comment-167</guid>
		<description>Nice coincidence that you should post this right now: I have a terminal window scrolling with status messages from my own download of a webcomic archive so I can catch up.

Fortunately the one I&#039;m getting now is of the first variety, with a nicely sequential number system. I was able to get this down to a single line in bash:

for i in `seq 1402`; do echo &quot;http://comic.com/images/$i.png&quot; &gt;&gt; list; done &amp;&amp; wget -P images/ -i list -o log -w 8 --random-wait -U &quot;$UserAgent&quot;

The first part creates a file that lists the links to all of the images (note: those are backticks, not quotation marks, around the seq command; adjust the number there to the number of the most recent comic).

The wget parameters are:

-P sets the directory prefix where the files are saved, in this case an &#039;images&#039; directory
-i reads the list of URLs from my &quot;list&quot; file
-o directs all output to the log file &quot;log&quot;, so I can scan through it for errors later
-w sets the base pause between requests to 8 seconds
--random-wait randomizes the pause to between 0.5 and 1.5 times the -w value (4 to 12 seconds)
-U sets the user-agent string, which determines how your requests look in their logs; in my case it comes from a variable that my bash setup loads for just such scripts
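
The list-building and download steps above can also be split into two small functions. This is only a sketch, assuming bash and seq are available; the names build_list and fetch_list are made up for illustration, and comic.com is the same placeholder URL as in the one-liner.

```shell
#!/usr/bin/env bash

# build_list LAST [BASE_URL]
# Hypothetical helper: prints one image URL per line for comics 1..LAST.
# The default BASE_URL is the placeholder from the one-liner above.
build_list() {
  local last=$1
  local base=${2:-http://comic.com/images}
  local i
  for i in $(seq 1 "$last"); do
    printf '%s/%s.png\n' "$base" "$i"
  done
}

# fetch_list LIST_FILE
# Hypothetical helper: downloads every URL in LIST_FILE with the same
# wget flags as the one-liner. Assumes $UserAgent is already set.
fetch_list() {
  wget -P images/ -i "$1" -o log -w 8 --random-wait -U "$UserAgent"
}
```

Running "build_list 1402 > list" followed by "fetch_list list" should behave like the one-liner, except that > truncates the list file, so running it twice does not append duplicate URLs the way >> does.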

I&#039;ve written scripts like this in pure bash as well as in Python. I love scripting this stuff!</description>
		<content:encoded><![CDATA[<p>Nice coincidence that you should post this right now: I have a terminal window scrolling with status messages from my own download of a webcomic archive so I can catch up.</p>
<p>Fortunately the one I&#8217;m getting now is of the first variety, with a nicely sequential number system. I was able to get this down to a single line in bash:</p>
<p>for i in `seq 1402`; do echo &quot;http://comic.com/images/$i.png&quot; &gt;&gt; list; done &amp;&amp; wget -P images/ -i list -o log -w 8 --random-wait -U &quot;$UserAgent&quot;</p>
<p>The first part creates a file that lists the links to all of the images (note: those are backticks, not quotation marks, around the seq command; adjust the number there to the number of the most recent comic).</p>
<p>The wget parameters are:</p>
<p>-P sets the directory prefix where the files are saved, in this case an &#8216;images&#8217; directory<br />
-i reads the list of URLs from my &#8220;list&#8221; file<br />
-o directs all output to the log file &#8220;log&#8221;, so I can scan through it for errors later<br />
-w sets the base pause between requests to 8 seconds<br />
--random-wait randomizes the pause to between 0.5 and 1.5 times the -w value (4 to 12 seconds)<br />
-U sets the user-agent string, which determines how your requests look in their logs; in my case it comes from a variable that my bash setup loads for just such scripts</p>
<p>I&#8217;ve written scripts like this in pure bash as well as in Python. I love scripting this stuff!</p>
]]></content:encoded>
	</item>
</channel>
</rss>
