How To Collect Web-based Technical Content

April 29, 2013
Taking Internet content offline may be easy if you want to keep it to yourself, but what happens when you want to share it? An ebook might be the answer.

Reading technical content like the articles we have on Electronic Design is easy if you have an Internet connection and a browser. You can even read it on a smartphone using the Electronic Design mobile website. I use it all the time since half the time I only have my smartphone with me.

The problem with web browsing is that it requires an Internet connection and it is not a great way to archive information. I have a large number of bookmarks and have turned to bookmark management tools. The problem is

I have saved HTML files before and most browsers have an offline mode. This is handy because you can view content when not online. The offline content can be viewed with the device you recorded it on and sometimes synchronization options allow it to be viewed on other devices.

The challenge is portability. For example, Apple’s Safari has an offline capability but getting something over to a Kindle is a completely different exercise. The same is true for most Windows-based browsers.

What I really want are ebooks because they are more portable. In fact, most ebook formats are built around HTML. They simply bundle the HTML files, along with their images, into a single file. Other formats do this as well. For example, MHTML, or MIME HTML, is used for web page archives by Internet Explorer and Opera. The main difference is that MHTML is designed to replicate a single page while an ebook is a collection of multiple pages. The advantage of MHTML is that programs that can generate the files can also view them.
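As a rough illustration of the "bundle the HTML files into a single file" idea, here is a minimal Python sketch. It assumes the EPUB 3 container layout: a ZIP archive whose first entry is an uncompressed mimetype file, plus a META-INF/container.xml that points at a package file listing the pages. The function name build_epub, the OEBPS/ folder, and the sample page are hypothetical choices made for this example, not a description of how any particular tool works, and the output has not been checked with a validator such as EPUBCheck.

```python
import io
import uuid
import zipfile
from datetime import datetime, timezone
from xml.sax.saxutils import escape

CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

XHTML_PAGE = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
<head><title>{title}</title></head>
<body>{body}</body>
</html>
"""


def build_epub(title, pages):
    """Bundle (filename, page_title, xhtml_body) tuples into EPUB bytes.

    Assumption: each xhtml_body is already well-formed XHTML markup.
    """
    items, refs, links = [], [], []
    for i, (name, page_title, _body) in enumerate(pages):
        items.append(f'<item id="p{i}" href="{name}" media-type="application/xhtml+xml"/>')
        refs.append(f'<itemref idref="p{i}"/>')
        links.append(f'<li><a href="{name}">{escape(page_title)}</a></li>')

    modified = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    opf = f"""<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="bookid">urn:uuid:{uuid.uuid4()}</dc:identifier>
    <dc:title>{escape(title)}</dc:title>
    <dc:language>en</dc:language>
    <meta property="dcterms:modified">{modified}</meta>
  </metadata>
  <manifest>
    <item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
    {' '.join(items)}
  </manifest>
  <spine>{' '.join(refs)}</spine>
</package>
"""
    # The navigation document is the table of contents for the collected pages.
    nav_body = '<nav epub:type="toc"><ol>' + "".join(links) + "</ol></nav>"
    nav = XHTML_PAGE.format(title=escape(title), body=nav_body)

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML)
        zf.writestr("OEBPS/content.opf", opf)
        zf.writestr("OEBPS/nav.xhtml", nav)
        for name, page_title, body in pages:
            page = XHTML_PAGE.format(title=escape(page_title), body=body)
            zf.writestr("OEBPS/" + name, page)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical sample page; a real collection would hold one entry per saved article.
    sample = [("page1.xhtml", "Saved article", "<h1>Saved article</h1><p>Body text.</p>")]
    with open("research.epub", "wb") as fh:
        fh.write(build_epub("Research notes", sample))
```

Each saved article becomes one XHTML entry inside the archive, which is why a single ebook file can hold a multi-page collection where an MHTML file holds one page. Images would be added the same way, as extra archive entries listed in the manifest.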

You can print our articles, and in some cases you have access to PDF versions of articles like Robotics Moves To The Mainstream. There is a link at the start of the article for the PDF. The PDF is from our print version of the magazine and it looks much better than a PDF based on the online content. The photos and the text are the same, but a full-page print magazine has a level of artistic quality that has yet to be replicated online. You can also subscribe to the PDF version of the magazine.

Getting content in a form you can use is often a challenge. PDFs are good in many instances but they are really poor for reading on a smartphone or phablet. Full-page PDFs are fair on a small 7-in. tablet and just fine on larger screens. Still, PDFs tend to lack the viewing flexibility of ebooks and even HTML files, although the latter are a pain to move if graphics or multiple pages are involved.

One tool I have found very useful is an add-on for Firefox called GrabMyBooks (Fig. 1). The docs can be found at the GrabMyBooks website. It can essentially turn an article on a webpage into a Kindle (.MOBI) or .EPUB ebook complete with images. It has editing capabilities and you can even grab from multiple open browser tabs. It is even possible to append content while you are browsing.

Figure 1. GrabMyBooks is a Firefox plug-in that can turn a webpage into an EPUB and Kindle ebook.

I use GrabMyBooks when doing online research. I have changed the second title to be the URL of the original content so it is easy to find where it came from. I do use the tool to generate content I can read on my smartphone or tablet but it is useful even on a PC with a large screen because it easily collects information together. I use Sigil to edit EPUBs.

The dotEPUB website is another alternative. It should work with any browser.

I still prefer to have my virtual library rather than an extensive bookmark collection. I know my backups are not going away whereas stuff on the web may change at any time. There are even some sites that discard content like they would old newspapers.

So turn this blog into an ebook and check out GrabMyBooks' RSS feed support too.

About the Author

William Wong | Senior Content Director

Bill Wong covers Digital, Embedded, Systems and Software topics at Electronic Design. He writes a number of columns, including Lab Bench and alt.embedded, plus Bill's Workbench hands-on column. Bill is a Georgia Tech alumnus with a B.S. in Electrical Engineering and a master's degree in computer science from Rutgers, The State University of New Jersey.

He has written a dozen books and was the first Director of PC Labs at PC Magazine. He has worked in the computer and publication industry for almost 40 years and has been with Electronic Design since 2000. He helps run the Mercer Science and Engineering Fair in Mercer County, NJ.
