Is there an existing plaintext file format for storing discussion forum posts like this one or those on ubuntuforums? I want to archive the discussions I like locally. I have been using SingleFileZ to download the whole page to my machine, but I prefer plaintext formats. When I tried org-web-tools, it did not seem to extract Reddit discussion pages properly, for example.
I suppose I can write a scraper and dump content in json format. I’d prefer a plaintext format like org-mode, one that was designed with some thought put into it, instead of cobbling something together myself.
mbox would be perfect. You can use Gnus or rmail to view them.
There is nnreddit.
There is also the RSS feed (add “.rss” to a subreddit’s URL). But that only has the posts, not the comments.
I suppose I can write a scraper and dump content in json format.
No need: Reddit already provides its data in JSON form. Generally, just append .json to the end of the URL and you get the JSON. For example:
https://www.reddit.com/r/emacs/comments/17u00j0/extracting_forums_posts_like_reddit_discussions/
->
https://www.reddit.com/r/emacs/comments/17u00j0/extracting_forums_posts_like_reddit_discussions.json
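Combining the two suggestions above (the .json URL and mbox), here is a rough Python sketch that turns one thread's JSON into an mbox file that Gnus or rmail could open. It is only an illustration under some assumptions: the JSON is taken to be a two-element list (the post listing, then the comment listing) with fields such as `name`, `author`, `body`, `selftext`, `created_utc` and `replies`; the function names, the `reddit.invalid` placeholder domain and the User-Agent string are made up for this example; and "more comments" stubs are simply skipped rather than fetched.

```python
import json
import mailbox
import urllib.request
from email.headerregistry import Address
from email.message import EmailMessage
from email.utils import formatdate


def fetch_thread(url):
    """Download a thread as JSON by appending .json to its URL."""
    req = urllib.request.Request(
        url.rstrip("/") + ".json",
        # Hypothetical User-Agent; the default one may be rejected.
        headers={"User-Agent": "thread-archiver-sketch/0.1"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def _msgid(fullname):
    # Made-up Message-ID scheme; "reddit.invalid" is a placeholder domain.
    return f"<{fullname}@reddit.invalid>"


def thread_to_messages(data):
    """Convert parsed thread JSON into a flat list of threaded messages."""
    post = data[0]["data"]["children"][0]["data"]
    title = post.get("title", "")
    messages = []

    def make(item, subject, body, parent=None):
        msg = EmailMessage()
        msg["From"] = Address(
            display_name=item.get("author") or "unknown",
            username="nobody",
            domain="reddit.invalid",
        )
        msg["Subject"] = subject
        msg["Date"] = formatdate(item.get("created_utc", 0))
        msg["Message-ID"] = _msgid(item["name"])
        if parent is not None:
            # These headers let a mail reader rebuild the comment tree.
            msg["In-Reply-To"] = _msgid(parent)
            msg["References"] = _msgid(parent)
        msg.set_content(body)
        return msg

    def walk(children, parent):
        for child in children:
            if child.get("kind") != "t1":  # skip "more comments" stubs
                continue
            c = child["data"]
            messages.append(make(c, "Re: " + title, c.get("body", ""), parent))
            replies = c.get("replies")
            if replies:  # an empty string when there are no replies
                walk(replies["data"]["children"], c["name"])

    body = post.get("selftext") or post.get("url", "")
    messages.append(make(post, title, body))
    walk(data[1]["data"]["children"], post["name"])
    return messages


def write_mbox(messages, path):
    """Append the messages to an mbox file (no locking; single writer assumed)."""
    box = mailbox.mbox(path)
    for msg in messages:
        box.add(msg)
    box.close()
```

Something like `write_mbox(thread_to_messages(fetch_thread(url)), "thread.mbox")` would then produce a file whose In-Reply-To headers mirror the comment nesting, so a threaded mail reader should show the discussion as a tree.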