Mailing List Archive Howto

From wiki.occupyboston.org

Archiving mailing list archives

Download a copy of http://archive.occupyboston.org/mailing-lists/ob-archive-list.

Create a new directory for your archiving work. If something goes wrong, this keeps stray files (say, thousands of HTML files) from littering your archive directory.

 mkdir work
 cp ob-archive-list work
 cd work

Now, let's get down to work. There are two cases to consider:

  1. public archives (those accessible without a password), and
  2. non-public archives (which require a username and password to access)

Public archives

Here's an example of archiving a public mailing list. You'll run ob-archive-list with the list's archive URL. For this example, the archive URL is https://lists.mayfirst.org/pipermail/fawg/.

$ sh ob-archive-list https://lists.mayfirst.org/pipermail/fawg/
ob-archive-list: downloading https://lists.mayfirst.org/pipermail/fawg/
2013-03-31 19:54:24 URL:https://lists.mayfirst.org/pipermail/fawg/ [1650/1650] -> "fawg/index.html" [1]
2013-03-31 19:54:24 URL:https://lists.mayfirst.org/pipermail/fawg/search [5925] -> "fawg/search" [1]
2013-03-31 19:54:24 URL:https://lists.mayfirst.org/pipermail/fawg/2013-March/thread.html [10065/10065] -> "fawg/2013-March/thread.html" [1]
2013-03-31 19:54:24 URL:https://lists.mayfirst.org/pipermail/fawg/2013-March/subject.html [7897/7897] -> "fawg/2013-March/subject.html" [1]
2013-03-31 19:54:24 URL:https://lists.mayfirst.org/pipermail/fawg/2013-March/author.html [7899/7899] -> "fawg/2013-March/author.html" [1]
   ...
2013-03-31 19:54:31 URL:http://lists.mayfirst.org/pipermail/fawg/attachments/20130227/d5702cea/attachment.xls [12800/12800] -> "fawg/attachments/20130227/d5702cea/attachment.xls" [1]
2013-03-31 19:54:31 URL:http://lists.mayfirst.org/pipermail/fawg/attachments/20130227/d5702cea/attachment.pgp [198/198] -> "fawg/attachments/20130227/d5702cea/attachment.pgp" [1]
FINISHED --2013-03-31 19:54:31--
Total wall clock time: 6.9s
Downloaded: 187 files, 901K in 1.0s (929 KB/s)
ob-archive-list: downloaded 187 file(s) from https://lists.mayfirst.org/pipermail/fawg/
ob-archive-list: creating fawg.tgz
ob-archive-list: uploading fawg.tgz => obarchive@archive.occupyboston.org:archive.occupyboston.org/web/mailing-lists/
X11 forwarding request failed on channel 0
fawg.tgz
      233700 100%   47.91MB/s    0:00:00 (xfer#1, to-check=0/1)

sent 233800 bytes  received 31 bytes  93532.40 bytes/sec
total size is 233700  speedup is 1.00
ob-archive-list: DONE

Afterwards, you should see a .tgz of the mailing list archives in http://archive.occupyboston.org/mailing-lists/.
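
The log above reports three stages: downloading the archive pages, creating a .tgz, and uploading it. As a rough outline only, the function below prints the kind of commands that could produce such a log. It is not the real ob-archive-list: the function name plan_archive is hypothetical, the wget, tar, and rsync options are assumptions, and the options that control the local directory layout are left out.

```shell
#!/bin/sh
# Hypothetical outline of the stages reported in the log: download,
# create .tgz, upload. The real ob-archive-list may work differently.
plan_archive() {
    url=$1
    # The list name is the last path component of the archive URL,
    # e.g. .../pipermail/fawg/ gives "fawg" (basename drops the trailing slash).
    name=$(basename "$url")
    # Upload destination as it appears in the log output.
    dest='obarchive@archive.occupyboston.org:archive.occupyboston.org/web/mailing-lists/'

    # Print the commands instead of running them.
    echo "wget --recursive --no-parent --no-verbose $url"
    echo "tar -czf $name.tgz $name"
    echo "rsync --progress $name.tgz $dest"
}
```

Calling plan_archive https://lists.mayfirst.org/pipermail/fawg/ prints the three commands without contacting any server, which is a cheap way to check the derived list name before a real run.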

Non-public archives

Non-public archives require a little more work. You'll need to:

  • View the archive URL in your web browser (which requires authentication)
  • Copy Mailman's cookie
  • Provide that cookie as a second argument to ob-archive-list

I'll use https://lists.mayfirst.org/mailman/private/everyone-submit/ for illustration.

Here's an example of a Mailman cookie. I've blacked out portions of the cookie Name and Content (that is, the cookie value), but those are the two pieces you need. Firefox lets you select and copy both values, and that's what you'll need to do.

Mailman-cookie.png

Here's an example of archiving a non-public list. Note that the second argument to ob-archive-list is the cookie Name=Content.

$ sh ob-archive-list https://lists.mayfirst.org/mailman/private/everyone-submit/ everyone-submit+user+jsmith--at--example.org=28020000006948c558517328000000363464363831643863643364000000000000000000061653964366165623239316462663439333630
ob-archive-list: downloading https://lists.mayfirst.org/mailman/private/everyone-submit/
2013-03-31 20:16:38 URL:https://lists.mayfirst.org/mailman/private/everyone-submit/ [4289] -> "everyone-submit/index.html" [1]
2013-03-31 20:16:38 URL:https://lists.mayfirst.org/mailman/private/everyone-submit/search [5964] -> "everyone-submit/search" [1]
  ...
2013-03-31 20:16:57 URL:https://lists.mayfirst.org/mailman/private/everyone-submit/attachments/20120827/d74bc7d2/attachment.htm [6451] -> "everyone-submit/attachments/20120827/d74bc7d2/attachment.htm" [1]
2013-03-31 20:16:57 URL:https://lists.mayfirst.org/mailman/private/everyone-submit/attachments/20120830/c68b69b6/attachment.htm [44740] -> "everyone-submit/attachments/20120830/c68b69b6/attachment.htm" [1]
FINISHED --2013-03-31 20:16:57--
Total wall clock time: 19s
Downloaded: 92 files, 1.9M in 3.3s (579 KB/s)
ob-archive-list: downloaded 92 file(s) from https://lists.mayfirst.org/mailman/private/everyone-submit/
ob-archive-list: creating everyone-submit.tgz
ob-archive-list: uploading everyone-submit.tgz => obarchive@archive.occupyboston.org:archive.occupyboston.org/web/mailing-lists/
X11 forwarding request failed on channel 0
everyone-submit.tgz
     1483299 100%   43.23MB/s    0:00:00 (xfer#1, to-check=0/1)

sent 1290846 bytes  received 7351 bytes  288488.22 bytes/sec
total size is 1483299  speedup is 1.14
ob-archive-list: DONE

Afterwards, look at http://archive.occupyboston.org/mailing-lists/ to make sure the .tgz uploaded correctly.
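
Pasting the cookie by hand is error-prone, since the Name and the Content are copied separately and must end up joined by a single "=". A small helper can do the joining and refuse empty values. This is only a convenience sketch: the function name cookie_arg and the variables NAME and CONTENT are hypothetical and not part of ob-archive-list.

```shell
#!/bin/sh
# Hypothetical helper: join a cookie Name and Content into the
# Name=Content form that ob-archive-list takes as its second argument.
cookie_arg() {
    name=$1
    content=$2
    # Refuse to build a cookie if either piece is missing.
    if [ -z "$name" ] || [ -z "$content" ]; then
        echo "cookie_arg: need both a cookie name and a content value" >&2
        return 1
    fi
    printf '%s=%s\n' "$name" "$content"
}
```

With NAME and CONTENT set to the values copied from the browser, the call would look like: sh ob-archive-list https://lists.mayfirst.org/mailman/private/everyone-submit/ "$(cookie_arg "$NAME" "$CONTENT")". Quoting the argument keeps the shell from interpreting any characters in the cookie.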

Mailing list deletion

After you've uploaded a mailing list archive, it's worth taking a moment to make sure the archive uploaded correctly. For example, after archiving LIST, do "ls -lh LIST.tgz". Compare the size of your local copy with the one on http://archive.occupyboston.org/mailing-lists.
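
The check described above can be sketched as a small script. It goes one step beyond "ls -lh" by also confirming that the local file is a readable .tgz, and it compares exact byte counts instead of rounded sizes. The function names size_bytes and check_archive are hypothetical, and the expected size is something you supply yourself, for example by reading it from the listing on the archive site.

```shell
#!/bin/sh
# Print the size of a file in bytes.
size_bytes() {
    wc -c < "$1" | tr -d ' '
}

# Succeed only if FILE is a readable gzip'd tarball and its size in
# bytes equals EXPECTED (the size of the uploaded copy).
check_archive() {
    file=$1
    expected=$2
    # Listing the contents fails if the file is truncated or not a .tgz.
    if ! tar -tzf "$file" > /dev/null 2>&1; then
        echo "$file: not a readable .tgz" >&2
        return 1
    fi
    actual=$(size_bytes "$file")
    if [ "$actual" -ne "$expected" ]; then
        echo "$file: size is $actual bytes, expected $expected" >&2
        return 1
    fi
    echo "$file: OK ($actual bytes)"
}
```

For example, check_archive LIST.tgz 233700 succeeds only if LIST.tgz is intact and exactly 233700 bytes long. A matching size does not prove the two copies are identical, but a mismatch is a clear sign that the upload should be repeated before the list is deleted.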

Once you're sure the archive copy is good, there is one remaining step: deleting the mailing list.

Deletion is permanent, so be careful and make sure you're deleting the right list!