APLawrence - Information and Resources for Unix and Linux Systems, Bloggers and the self-employed

Unix and Linux Help, Resources and information for Unix/Linux, Mac OS X. Articles on blogging, web site mechanics, and self employment. Mostly techy, Unix/Linux related, but we don't really try to stay tightly focused. If you've never been here before, there's a lot to explore.


Kerio Connect 7 Beta

2009/11/19

Kerio is renaming its mailserver to "Kerio Connect". My initial reaction is that changing "Mailserver" to "Connect" is a bad idea. They feel that because the product is so much more than a mailserver, changing the name is appropriate.

I do partly agree. Many of my customers only use the email features and ignore all of the collaboration aspects. Sometimes that's because they see no need for scheduling, etc. and sometimes it's because they just never noticed those capabilities (someone only using POP or IMAP, for example).

Well, regardless of how I feel about it, the next release will be Kerio Connect. It's in beta now and looks solid enough that its release is likely not very far away, so let's take a look at what it offers.

Message Retention

A few releases back Kerio introduced scheduled administrative deletion of Junk and Deleted Items messages. With Kerio Connect 7, this has been extended to include all folders (except Contacts, for obvious reasons). You can set a policy that automatically deletes all messages domain-wide after so many years.

I don't see this as particularly useful for most companies. For now it's only domain-wide; unlike the Junk and Deleted Items cleanup, it can't be set for individual users. Some users may need to keep some email (or specific email folders) forever. I think this needs more fine-grained control to be useful.

Distributed Domains

One of the first features I noticed is distributed domains. From the manual:

If your company uses more Kerio Connect servers physically scattered (located in different cities, countries, continents), you can now add them to a cluster and move all users across all servers involved into a single email domain (distributed domain).

Note that this isn't load balancing. One server is the master point where all incoming email arrives; it is responsible for relaying any that belong at a satellite server. The "slave" servers should have the master set as their relay also if you want single-point archiving and backup.

The "Master/Slave" designation is arbitrary. All servers are really peer to peer and use the same directory service. You determine which mailboxes belong on which server and which server is the master. Obviously the master would also need to be the server that is set as the MX for the domain.

Message Submission service

You'll find this in Services, set to run on port 587. Kerio suggests using this to get around the problem of outgoing port 25 being blocked at hotels and public access points. The user sets his outgoing SMTP port to 587 and the Kerio server listens on that. As this service requires authentication, it can't be used by spammers - unless they've hacked the user's account, of course, but at least we do then absolutely know the source of the spam!

The Message Submission service was originally defined in RFC 2476 (since superseded by RFC 4409) and has much more to do with mail architecture than just bypassing blocked ports. This FAQ: SMTP Message Submission to Proposed Standard describes the reasoning behind the RFC.

Performance improvements

The release notes say:

Kerio Connect uses a more efficient file access method to the message store data. This includes the properties.fld database access and listing mailbox folders.

That doesn't tell us much, does it?

The "properties.fld" file is apparently IMAP annotation data. It's interesting to look at what these metadata files are:

index.fld: ASCII text, with CRLF line terminators
properties.fld: Berkeley DB 1.85/1.86 (Btree, version 3, little-endian)
search.fld: SQLite 2.x database
status.fld: ASCII text, with CRLF line terminators

But that still doesn't tell us how any of this helps speed up file access.

Kerio has a problem with very large mailbox folders. They store individual mail messages rather than packing them all into one database as Exchange does. I think that's the right approach, but it can cause performance issues. These files are designed to help that by providing a mix: database files pointing to individual messages. Apparently this release has made some changes in this area; we'll see if it helps very large folders.

I'd much prefer to see domain or user controlled folder archiving. I'm not referring to the archiving found in Archiving and Backup but rather moving older messages into another folder at regular intervals. For example, if this were set for monthly archiving, everything in your Inbox from last month would be moved to Inbox-2009-09. But that's not a feature of this release and may never be.

Web Administration

With Kerio Connect 7, you can now do all administrative functions through your web browser. For example, on the server itself, I can connect to http://localhost/admin.

This actually runs on port 4040 and there's no control over that in Administration. However, it is listed in mailserver.cfg, so you could adjust this as necessary.

Speaking of config files, there's a new cluster.cfg file. I assume that is for the distributed domains mentioned earlier, but there is also an undocumented Cluster section in the mailserver.cfg, so bigger plans may be afoot. That's pure speculation, of course.

The web administration is very useful - it's not that it's at all difficult to download the free administrative console, but having this available from any web browser is handy.

Additional changes

The release notes mention over-the-air synchronization of HTC Hero mobile devices and that the IMAP server has been improved for better support of multi-session IMAP connections. You'll be able to rename a domain also. That seems to be about it.


Freeing disk space with ">"

2009/11/18

I wrote this up after a forum discussion in which several posters didn't really understand why ">" can free disk space when "rm" cannot. The basic problem is that if another process has a file open (for reading or writing, it doesn't matter), the disk blocks are not freed by an "rm" until the process or processes using the file quit (or stop using the file, at least). That part seems to be well understood.

What is perhaps more difficult to understand is why a simple ">" CAN free up the bytes that "rm" cannot.

For those very new to Unix/Linux: if you are using almost any shell but "csh", a ">" followed by a file name will empty that file. That is, it will be truncated to zero bytes without changing the ownership or permissions. On more recent Linux machines, you may have a "truncate" command that will do the same thing.
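If you want to see that for yourself, here is a minimal sketch. The scratch directory and the file name "t" are just placeholders, and it uses ": > t" (the do-nothing command plus a redirect) because that form behaves the same in any Bourne-style shell; a bare "> t" does the same thing in sh and bash.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real gets emptied.
dir=$(mktemp -d)
cd "$dir" || exit 1

printf 'some data\n' > t
chmod 640 t
mode_before=$(ls -l t | cut -c1-10)    # permission string, e.g. -rw-r-----

: > t                                  # truncate "t" to zero bytes in place

mode_after=$(ls -l t | cut -c1-10)
size_after=$(wc -c < t | tr -d ' ')

echo "mode before: $mode_before"
echo "mode after:  $mode_after"
echo "size after:  $size_after"

cd / && rm -rf "$dir"
```

The file is still there with the same permissions; only its contents are gone.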

For those NOT new to Unix or Linux, this article isn't meant for you. Unfortunately, it was linked to from some places frequented by more advanced users. Feel free to read it, of course, but it's not going to tell you anything you do not already know.

To show that, we need to write a little code. I'll use Perl for that, but if you don't grok Perl, don't worry - I'll explain it as we go along. I did this on a Mac, but you'd see the same thing on Linux or BSD.

Let's start with the "rm" issue. Our Perl code will just open a file and loop. We'll run that in one Terminal window and do everything else in another.

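#!/usr/bin/perl
# Open "t" for reading and hold it open; we never actually read from it.
open(I, "./t") or die "cannot open ./t: $!";
while (1) {
    sleep 10;
    print "I'm still here\n";   # newline so the message shows up promptly
}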

The script just opens "t" and then loops. It never reads or writes anything, but it does have "t" open while running.

The file "t" already exists before running this and is large enough to notice its absence in "df". Here it is before the script runs and while it runs:

$ ls -l t;df | head -2
-rw-r--r-- 1 apl apl 20480 Nov 18 08:50 t
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 99862360 55255304 65% /
# start the script in another window
$ ls -l t; df | head -2
-rw-r--r-- 1 apl apl 20480 Nov 18 10:11 t
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 99862360 55255304 65% /

If we now remove "t", nothing will change:

$ rm t
$ ls -l t; df | head -2
ls: t: No such file or directory
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 99862360 55255304 65% /

When we interrupt the script, disk space is reclaimed:

$ ls -l t; df | head -2
ls: t: No such file or directory
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 99862320 55255344 65% /

However, if we do the same thing with ">", disk space will be reclaimed instantly:

$ ls -l t;df | head -2
-rw-r--r-- 1 apl apl 20480 Nov 18 09:20 t
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 100006784 55110880 65% /
$ > t
$ ls -l t;df | head -2
-rw-r--r-- 1 apl apl 0 Nov 18 09:21 t
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 100006744 55110920 65% /
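On a busy system the "df" numbers can be hard to read, so here is another way to see the same contrast without "df" at all. This is only a sketch: the shell itself plays the part of the "other process" by holding the file open on descriptor 3, and the names are placeholders. After "rm" the data can still be read through the open descriptor (so the blocks obviously still exist); after ">" there is nothing left to read.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1

# Case 1: "rm" while the file is held open.
printf 'xxxxxxxxxx' > t                 # 10 bytes
exec 3< t                               # this shell now has "t" open on fd 3
rm t                                    # the name is gone...
after_rm=$(wc -c <&3 | tr -d ' ')       # ...but the data is still readable
exec 3<&-                               # closing fd 3 is what finally frees it

# Case 2: ">" while the file is held open.
printf 'xxxxxxxxxx' > t
exec 3< t
: > t                                   # truncate in place
after_trunc=$(wc -c <&3 | tr -d ' ')    # nothing left behind the descriptor
exec 3<&-

echo "bytes readable after rm: $after_rm"
echo "bytes readable after >:  $after_trunc"

cd / && rm -rf "$dir"
```

"rm" only removes the name; the data stays until the last descriptor is closed. ">" throws the data away immediately, no matter who has the file open.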

I demonstrated similar code at the forum and quickly got back this comment:

The > trick will simply remove the data in the file and have no need for the os to clear up unused data so this might work unless the process remembers where in the file it is appending too and always does a seek.

Let's see if that's true. We'll need a different script:

#!/usr/bin/perl
open(I,">./t");
while (1) {
    print I "x" x 4096;
    print "Block written\n";
    sleep 10;
}

This time, the script writes 4096 bytes (4096 "x"'s) on every loop. I'm not going to bother to show the listings and df's; the behavior is exactly the same: the bytes are freed as soon as you do "> t".

But our doubting poster mentioned "seek". For those who do not know, a seek moves the writing or reading position of the file to a specific place. We can do that with Perl:

#!/usr/bin/perl
open(I,">./t");
$x=0;
while (1) {
    $mypos=$x * 4096;
    # seek takes FILEHANDLE, POSITION, WHENCE; whence 0 means "from the start"
    seek I, $mypos, 0;
    print I "x" x 4096;
    print "Block $x written\n";
    $x++;
    sleep 10;
}

This doesn't do anything different than the previous script; it just does it another way. Instead of just writing bytes, it specifically positions itself before writing. If nothing else is happening with "t", no different outcome is expected.

What happens when we do "> t" while that puppy is running?

$ ls -l t;df | head -2
-rw-r--r-- 1 apl apl 20480 Nov 18 09:20 t
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 100006784 55110880 65% /
$ > t
$ ls -l t;df | head -2
-rw-r--r-- 1 apl apl 0 Nov 18 08:36 t
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 100006176 55111488 65% /

Instant reclaim. But on the next write from the script, the file's size is right back up:

$ ls -l t;df | head -2
-rw-r--r-- 1 apl apl 20480 Nov 18 08:36 t
Filesystem 512-blocks Used Available Capacity Mounted on
/dev/disk0s2 155629664 100006216 55111448 65% /

But - and this is the important part - notice that the available disk space did NOT change!

What happens here is that the file goes to zero and available space increases, but then when the writer writes again, it's back to a large size instantly. That's because of the "seek" - the bytes were written at a specific position. But the available space is NOT back to what it was, and "od" shows why:

$ od -c t 0000000 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 \0 * 0040000 x x x x x x x x x x x x x x x x *

If you had looked at "t" before the ">", it would have looked like this:

$ od -c t
0000000 x x x x x x x x x x x x x x x x
*

"od" marks repeated lines with a "*" - since we are writing nothing but "x", there's no need to show more. After the ">", "od" shows "\0" in all of the bytes up to the subsequent write by the script. Those nul bytes aren't really there - this is a "sparse" file. It was created by the absolute seeks of the Perl script after "> t" had emptied the file.

If you didn't understand this, I encourage you to play with these scripts on your own system. To avoid confusion, the system should be relatively "quiet" while you do this - I couldn't control that absolutely here, so some figures shown by "df" are different than you might expect - that's because I had a few other things going on while doing this. You still should be able to see that ">" really does reclaim space.
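If you'd rather not wait on a looping script, the hole can be reproduced in a few lines of shell. This is a sketch with placeholder names: descriptor 3 stands in for the writing process. A writer that keeps its own file position (which is what the seeking script does) carries on at its old offset after the truncate and leaves nul bytes in front. For comparison, a writer opened in append mode with ">>" always writes at the current end of the file, so it leaves no hole. Whether a particular daemon opens its log in append mode is something you would have to check.

```shell
#!/bin/sh
dir=$(mktemp -d)
cd "$dir" || exit 1

# Writer opened with ">": the descriptor remembers its own offset.
exec 3> t
printf 'xxxx' >&3            # the writer's offset is now 4
: > t                        # someone else empties the file
printf 'yyyy' >&3            # written at offset 4, leaving a 4-byte hole
exec 3>&-
hole_size=$(wc -c < t | tr -d ' ')
hole_hex=$(od -An -tx1 t | tr -d ' \n')

# Writer opened with ">>": each write goes to the current end of file.
rm -f t
exec 3>> t
printf 'xxxx' >&3
: > t
printf 'yyyy' >&3            # the file is empty, so this lands at offset 0
exec 3>&-
append_size=$(wc -c < t | tr -d ' ')
append_hex=$(od -An -tx1 t | tr -d ' \n')

echo "> writer:  $hole_size bytes, hex $hole_hex"
echo ">> writer: $append_size bytes, hex $append_hex"

cd / && rm -rf "$dir"
```

In the first case "ls -l" reports the larger size even though the leading bytes were never written; how little disk that hole really occupies depends on the filesystem's support for sparse files.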

The lesson is this: if another process has a file open, use ">" to reclaim the disk space. If the process is doing absolute seeks, you may not be able to tell with "ls -l" that the space has been reclaimed, but it will be.


Comments /Unixart/freeing-disk-space.html

Fri Nov 20 07:09:31 2009 anonymous

boah, what an overly complicated approach this is when it is done with Perl. When you do this directly in a shell, a 'cat /dev/null 2>&1 > file' is enough to cut down a file to zero, while no filehandles are lost. This could in practice be used to reclaim the diskspace a large file (for instance a logfile, that is permanently opened by a daemonized process) is blocking, without having to restart the daemon.

Happy diskspace reclaiming

Fri Nov 20 08:10:18 2009 dutchkind

Interesting post! I don't think the anonymous commenter read well, the perl script was, I think, only to show the > t was working.

Fri Nov 20 11:03:40 2009 TonyLawrence

what an overly complicated approach

Another example of someone scanning a page at breakneck speed with no comprehension :-)



Fri Nov 20 13:11:10 2009 anonymous

To be fair to the other poster, this article is long-winded, needlessly complicated and doesn't really answer the question it poses at the start. I can easily imagine a beginner being baffled by this exposition.

">" doesn't "free disk space". It truncates files (in this instance). When you remove data from a file, the space it used is reclaimed, even if a program is reading/writing to the file. When you remove the file itself, it's not reclaimed until no programs are reading/writing. That's the difference, it's not hard and doesn't need perl scripts or hex dumps to demonstrate.

If you'd used the actual "truncate" command it would also have been much clearer, and then you could have pointed out at the end that an empty redirect effectively performs a truncation.

Fri Nov 20 13:28:22 2009 TonyLawrence

I'm not aware of any "truncate" command. There is a "truncate" library call - is that what you mean?

If so, are you suggesting that it would be better to write a C program for this explanation than to use ">" ????

Whether YOU think "it's not hard and doesn't need perl scripts or hex dumps to demonstrate. " isn't the point. *I* had people arguing with me, insisting that truncating the file with ">" wouldn't help free disk space when the file was open for writing by another process.

Now - perhaps you are annoyed because you were sent here from the Developer section of Linux Today? I agree - this article never should have been put there. That's not my fault - I don't submit posts to LT. This article was written for people who DON'T understand what happens when a file is truncated.

If you take a look at the top left, you'll notice this is tagged as "Basics". That means it's an introductory article written for relative newbies who DO need examples, hex dumps, and long winded explanations. Again - not my fault LT put a link to this in a more advanced section; I cannot control that.

Sheesh :-)






Fri Nov 20 13:54:36 2009 TonyLawrence

I almost wish other sites like that wouldn't pick up these posts.

Yeah, we pick up a few regulars now and then, but mostly it's people who scan quickly looking for something to bitch about. Usually Linux Today readers aren't like that, but when something gets mis-categorized like this (it was linked to from the Developers section), I suppose it's natural for people to get upset.









Fri Nov 20 14:06:17 2009 anonymous

"I'm not aware of any "truncate" command. There is a "truncate" library call - is that what you mean?"

No, I mean the truncate command. It's part of coreutils. I wouldn't comment on Perl scripts and then suggest writing in C....

As for being tagged with basics, that's not the point. I'm commenting on the comprehensibility of the material. It wasn't until my post that you even said that > "truncates" - you didn't explain what it was doing _at all_.

Worse, in your Perl script, you moved from opening "./t" to opening ">./t". On an article about > the shell redirector, and talking about running '> t', that's _really_ confusing.

I'm sorry you're not too happy receiving critiques of your posting, but if people are coming here from other sites I think they deserve a clearer explanation of what you're trying to get across. If you don't want comments don't have a comment box, and if you don't want other people linking to your articles then don't publish them. Otherwise, why not just try to learn from the experience? You could have written this in 1/3rd the space and made it 3 times clearer.

Fri Nov 20 14:30:36 2009 TonyLawrence

For one, you are being Linux centric. This article references Mac OS X and other Unix systems.

"truncate" is relatively new to coreutils - added in the last year or so. For example, I find no "truncate" on my older Ubuntu. Apt-get says it knows of no such thing and that coreutils is up to date. Obviously that isn't true, but at this point I don't think it's reasonable to depend on that even if we were just talking about Linux.

As to the rest, well, I disagree. The ">" in the Perl scripts have nothing to do with anything. If you find that confusing, well, I'm sorry. But you are entitled to your opinion.

I don't mind critiques. I mind stupidity like the first comment. Your opinion that this is confusing and too long is fine. Again, I don't agree, but I agree that you can feel that way.




Fri Nov 20 14:37:29 2009 TonyLawrence

I almost forgot:

If you feel you can do better (and I do NOT mean that in the sarcastic way it is often used), I wish you would. I'd be extremely happy to link to your article or to publish it here ( http://aplawrence.com/publish.html ).

The intent of this site has always been to HELP people. I believe that there are always different ways to explain something and that one person will find one way more helpful than another. So, if you or anyone else wants to improve on this, please do.

Fri Nov 20 15:24:10 2009 anonymous

"If you feel you can do better (and I do NOT mean that in the sarcastic way it is often used), I wish you would."

I already did, that's why I commented here rather than writing an article on another site.

I don't imagine many beginners will follow your explanation, shell scripts or not, and even if they agree it works as you say I don't think they would understand why solely from this article. That's why I'm offering comments.

I'll also offer a corrected Perl script, just in case people do actually want to try this.

#!/usr/bin/perl
open(I,">./t");
$x=0;

while (1) {
$mypos=$x * 4096;
seek I, $mypos, 0;
print I "x" x 4096;
print "Block $x written\n";
$x++;

sleep 10;
}

You're welcome.


Fri Nov 20 15:38:07 2009 TionyLawrence

I'm just going to smile politely and say sure, whatever you say. Thanks for visiting and have a lovely day. It's Friday, it's raining here but balmy warm and I'm sure we both have better things to do.



Fri Nov 20 16:35:51 2009 BigDumbDInosaur

http://bcstechnology.net

cat /dev/null 2>&1 > file

Did I just read a complaint about the article content being "too complicated?"

Also, who's that Tiony guy that's passing himself off as A.P. Lawrence? <Grin>

Fri Nov 20 16:41:03 2009 TonyLawrence

I saw that Tiony guy earlier today in the bathroom when I was shaving. He looks dangerous.

Fri Nov 20 16:52:59 2009 TonyLawrence

But seriously:

The guy complaining thinks I should have used "truncate". I disagree because that's relatively new and is Linux centric. But he's right that I should have fully explained that ">" truncates a file. I didn't think about that while writing this because of the context - I was writing for people who knew what ">" does but didn't think it would free up disk space if the file were in use.

I'm going to add a paragraph up there to correct that oversight. I already removed the gratuitous "}" that snuck itself into one Perl script earlier (which that same reader also noticed).

As to being overly long and using unnecessary examples, I make no apology. The people at that forum needed specific examples.



Fri Nov 20 20:19:33 2009 Open with O_TRUNC frees blocks anonymous

I really doubt that either bash or csh uses ftruncate. Instead, they are probably using the O_TRUNC flag to open, which is part of POSIX and thus quite portable. From man 2 open:

O_TRUNC: If the file already exists and is a regular file and the open mode allows writing (i.e., is O_RDWR or O_WRONLY) it will be truncated to length 0. If the file is a FIFO or terminal device file, the O_TRUNC flag is ignored. Otherwise the effect of O_TRUNC is unspecified.


Fri Nov 20 20:23:14 2009 TonyLawrence

OK.

And your point is what?

Sat Nov 21 17:32:43 2009 anonymous

Some files should not be truncated though, but just removed, as they may still be in use (and I mean something reading from it, like a kernel mount on a loopback-mounted ISO file).

Sat Nov 21 17:46:54 2009 TonyLawrence

Some files should not be truncated though

Yes, good point - if mounted, you definitely would not want to truncate without unmounting.


The Art of SEO


2009/11/17


  • The Art of SEO
  • Enge, Spencer, Fishkin, Stricchiola
  • 0596518862


I hope this book helps put an end to ridiculously priced SEO "courses". It could also help eliminate some of the shadier practitioners of SEO, but I'd be happy if it just helps a few people avoid wasting money on nonsense.

This book isn't nonsense. It's a complete exposition of the state of SEO as it stands today. More than 500 pages cover everything you could ever hope to know about optimizing your website for search engines.

I've said before that it's unnecessary to pay for expensive courses to learn SEO because all of it, every single thing you'd ever need to know, is available on the Internet for free. That's still true, though of course the problem is that you have to find it and learn to discern good advice from bad, outdated concepts from those that reflect current reality. This book pulls everything together in 500 pages or so.

Of course it's not for everyone. It assumes some technical knowledge or at least access to someone with that knowledge. As an example, at several points 301 redirects are used as the solution to certain SEO issues. Although some minor direction is given about implementing these, you'd need more than is provided here if you were a neophyte - someone using Blogger or Wordpress.com isn't going to become proficient with Apache from reading this.

The only small complaint I can make is that sometimes the authors use too many examples, especially for the more basic concepts at the beginning of the book. However, too many is far better than too few, so I won't complain too much. I definitely can't complain that they left anything out: this could serve as a course book for a SEO class.

So, once more: don't waste your money on expensive "courses" from self-styled Internet Gurus. Buy this instead. If you are already reasonably proficient in this area, this book can serve as a reference and refresher. Either way, I strongly recommend it.


Tony Lawrence 2009-05-03 Rating: 4.5

Order (or just read more about) The Art of SEO from Amazon.com. Yes, I earn a small referral fee if you use that link to purchase the book.


Sort -u vs. uniq

2009/11/15

I have sometimes seen people use a pipeline that includes "sort | uniq". The result of that is no different than just adding a -u flag to sort and absolutely requires more time and processing power - not that it usually matters; unless the input is humongously long, you'd need to run them through "time" to spot any difference. So why use "uniq"?

For cases like that, where there is no difference in the output, it's probably just habit - you may be accustomed to using "uniq" for other jobs and just reach for it automatically. I'll argue that it's a good habit to have: if you are in the habit of using "sort -u", you may tend to forget about "uniq" and that could cause you to do something much more difficult and clumsy when a job needs something that "uniq" does well.

However, it's also true that "sort" has tricks that "uniq" lacks, so if you only know about "uniq", you again could make your life more difficult.

One of the helpful abilities that "sort" has is the ability to specify the field separator. Let's take a sample file:

A:B:C:D
a:b:c:d
t:b:c:d
t:b:c:d
a:b:c:d
a:b:x:d
foo:b:x:d
f:a:x:d

If all we cared about was removing duplicate lines, we could use "sort -u file" or "sort file | uniq". But what if we want to sort by the second field?

We can do that directly with "sort -t: -k 2 -u", but it's much harder to do with "uniq" because you can't tell it a separator character. You can get around that partially with "tr" or "sed", translating ":"'s to spaces or tabs, but that's clumsy. Even after translating, "uniq" only lets you skip fields, so you don't get quite the same output:

$ sort -t: -k 2 -u file
A:B:C:D
f:a:x:d
a:b:c:d
a:b:x:d
$ cat file | tr ":" " " | sort | uniq -f1
A B C D
a b c d
a b x d
f a x d
foo b x d
t b c d

We could argue about which output truly represents unique lines when sorted on field 2, but the point to understand is that skipping fields isn't the same as what "sort" does.

You can also lock down fields with "sort" :

$ sort -t: -u -k2,2 file
A:B:C:D
f:a:x:d
a:b:c:d

As "uniq" can only skip fields and can't anchor to one field only, it's much harder to get these results. However, "uniq" again has tricks that "sort" can't do: it can skip a specific number of characters in addition to skipping fields. It can also give you only the unique lines or only the lines that were repeated:

$ sort file | uniq -u # only the unique, non-repeated lines
A:B:C:D
a:b:x:d
f:a:x:d
foo:b:x:d
$ sort file | uniq -d # repeated lines
a:b:c:d
t:b:c:d

Either of those is extremely convoluted without "uniq", and the need for one or the other does come up surprisingly often.
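The character-skipping trick mentioned above deserves a quick illustration too. This is a sketch using made-up log lines: each line starts with an 8-character timestamp plus a space, so "uniq -s 9" ignores those 9 characters when comparing neighbors and collapses repeated messages even though the timestamps differ.

```shell
#!/bin/sh
# Hypothetical log excerpt; only the text after the timestamp matters here.
log='09:00:01 disk full
09:00:02 disk full
09:00:03 disk ok
09:00:04 disk full'

# Skip the first 9 characters of each line when looking for adjacent repeats.
deduped=$(printf '%s\n' "$log" | uniq -s 9)
printf '%s\n' "$deduped"
```

Note that the input is deliberately not sorted: "uniq" only ever compares adjacent lines, which is exactly what you want for collapsing runs of repeated messages while keeping them in time order.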

Somebody thought that we could use "sort" and "uniq" in one program: Sortu is the result.

The sortu program is a replacement for the sort and uniq programs. It is common for Unix script writers to want to count how many separate patterns are in a file. For example, if you have a list of addresses, you may want to see how many are from each state. So you cut out the state part, sort these, and then pass them through uniq -c. Sortu does all this for you in a fraction of the time.

I think by the time I figured out how to use "sortu" I could have already done the job another way, but you might find it interesting anyway.
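For reference, the job that description talks about (cut out the state, sort, count) looks something like this with the standard tools. The address list is invented for illustration and uses ":" as the separator.

```shell
#!/bin/sh
# Hypothetical address list: name:city:state
addresses='Ann:Boston:MA
Bob:Hartford:CT
Cy:Salem:MA
Di:Austin:TX
Ed:Lowell:MA
Flo:Dallas:TX'

# Cut out the state, sort so equal states are adjacent, count each run,
# then list the most common states first.
counts=$(printf '%s\n' "$addresses" | cut -d: -f3 | sort | uniq -c | sort -rn)
printf '%s\n' "$counts"
```

The first "sort" is there only because "uniq -c" counts adjacent lines; the second one orders the result by count.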

I think the important thing is to realize that "sort" and "uniq" have both conflicting and complementary abilities. Don't tie yourself in pipeline knots with either of them; learn to use each of them appropriately and your scripts will be easier.


Comments /Unixart/sort-vs-uniq.html

Mon Nov 16 01:38:49 2009 ScottCarpenter

http://www.movingtofreedom.org

I learn and forget about uniq occasionally. sort -u usually does the job for me, but the other day I wanted to count how many occurrences there were of each thing, which led me to rediscovering uniq -c in concert with sort.

Mon Nov 16 04:35:21 2009 anonymous

Are there any versions of sort out there that do not have the -u option? I'm wondering if the reason some people use sort | uniq is portability...

Mon Nov 16 11:11:57 2009 TonyLawrence

Are there any versions of sort out there that do not have the -u option?

I don't think so - it's had that ever since I can remember.

Mon Nov 16 13:41:54 2009 BigDumbDinosaur

http://bcstechnology.net

sortu is the sort (ouch!) of creeping featurism that seems to be taking over in present day Linux distributions. We should keep that sort of programming in the Windows camp and stick to the UNIX philosophy of making small tools that do a small group of related tasks well.

As for uniq, I know about it but for some reason have never found a good reason to use it. I guess sort and pipes are too entrenched in my mind.

Mon Nov 16 20:04:39 2009 TonyLawrence

But "uniq" can do things sort -u cannot...


LiveUSB OpenBSD project at sourceforge

2009/11/14 Girish Venkatachalam


Girish Venkatachalam is a UNIX hacker with more than a decade of networking and crypto programming experience. His hobbies include yoga, cycling, and cooking, and he runs his own business. Details here:

http://gayatri-hitech.com
http://spam-cheetah.com

It is really easy to create a USB bootable OpenBSD LiveUSB image. With that you can do just about anything you want. Don't believe me?

Then head to http://liveusb-openbsd.sourceforge.net and download the USB image. Boot it and find out!

You can watch videos with mplayer in full screen, you can bask in the glory of mplayer's sexy OSD menu, you can read manual pages in color, you can lookup English words using the dictionary client, you can chat with pidgin, you can browse with Mozilla Firefox, you can use sox to convert audio, you can play any video or audio file with mplayer, you can stream audio from the Internet, you can do whatever you want!

Moreover you can also use the rich repertoire of tiny but incredibly powerful tools like netcat, socat, nmh, mutt, vim, randtype, figlet. After all, the man pages tell you how to use these tools and you have examples too. You also have ready access to the perl, python and lua interpreters, you have all the spam control daemons, the routing protocols like BGP or OSPF, you have an FTP server and an HTTP server, or you could do image processing with ImageMagick.

There is one detail however.

You have to use DHCP to connect to the Internet if your ADSL MODEM dishes out dynamic IPs or you can configure the IP using the ifconfig command. Usually this will do.

	# dhclient vr0
	(Your ethernet interface could be fxp0, rl0 or something else,
find out with ifconfig)

Give it a whirl and get in touch with me should you have any issues using this. After all it is free and open source.

And oh by the way the fixed write cycles of USB memory drives is largely a myth.


Comments /Girish/LiveUSB.html

Sun Nov 15 15:30:40 2009 JonR

"And oh by the way the fixed write cycles of USB memory drives is largely a myth."

My heart leaped with joy at seeing this note. It seems virtually every article (or forum post) touching on solid-state drives, including USB devices, perpetuates the misinformation that these things will wear out pretty quick. (And the word "largely" is well chosen, too, because there is the _possibility_ of wearing out; but today's design and manufacture techniques have probably placed the possibility in the neighborhood of the always-present potential of a mechanical hard drive to self-destroy.) More likely than wearing out, solid-state memory is apt to outlive the average spinning drive, and possibly by a factor of years.

If people are educated to see that reputable manufacturers' solid-state drives are actually an improvement over mechanical ones, the price of them is bound to drop, as it must if they're going to get into truly widespread use. I'd gladly replace my laptop drives with solid-state if I had several thousand dollars to play with.

I'm going to try out the OpenBSD-to-go you describe here. I did make a live CD of the current OpenBSD distro a couple of months ago but was disappointed by the difficulty of getting wireless set up. I understand why the developers chose to do it that way; it's so virtually every chipset in existence will work with OpenBSD. But I wasn't willing to go through all that for a live test-drive, and without wireless I wasn't going to connect to the Internet (via Ethernet cable) due to physical logistics where I live. What I did see, I liked. I use Linux exclusively except in emergencies or dead-end situations where I simply have no choice but to fall back on MS Windows. I'd welcome a distro that's more nimble than the Linux ones are tending to be now with all the Gnome atrocities and other bloat that's dragging them down, sorry to say, steadily to the state of mediocrity -- or worse.

Sun Nov 15 22:31:52 2009 JonR/USB memory drives AndrewSmallshaw

Yes, these days wear on flash drives is pretty much a non-issue. Decent drives now have wear levelling as standard. That may not be true if you buy some ultra cheap drives from the local pound shop or wherever, but if you care about your data you are hardly likely to trust it to those anyway.

Still not sure about performance though. USB is a nontrivial protocol and while it has high headline bandwidth too often it does not equate to real performance - it's kind of the same argument that makes people spend hundreds on a decent disk controller rather than the on-motherboard one with a SATA drive.

My most direct SSD experience to date has been with my "new" home server. It's a slightly odd configuration in many ways, and I could probably write a whole article on it, but it uses a compactflash card in an IDE adapter as the root filesystem to keep the main disk spun down as much as possible. It works reasonably well - read speeds are fine after a little tweaking but writes, especially random ones, are absolutely hopeless - unpacking a source code tarball often averages literally floppy disk speeds. It's only the fact that once a system is steady state largeish writes to the root fs are comparatively rare that makes it remotely usable. This affects all SSDs, not just that CF card I bought for £15 on the grounds that if it blew up it would be no trouble to replace.

Yes, moves are afoot to resolve this kind of use - the new ATA TRIM support may help things a lot by erasing empty blocks ahead of time, but that requires support that NetBSD doesn't have yet. I did briefly consider Linux but I decided not to go there for the bloat you mention. Ten years ago the Linux community had a system they were only too keen to point out was so much better than Windows, but then proceeded to try and make it more and more like Windows.

Mon Nov 16 11:49:37 2009 Stefan

Hello Girish,

while I am very excited over any new way of using OpenBSD, the actual use of this sourceforge project is still unknown to me. As far as I can tell, the user is expected to be able to install a BSD/linux system, use the package manager, download the live boot image and copy it onto the USB thumb drive.
All in all this seems a fairly complex process to me, given the target audience here.

A person who can do all that should also be able to read the OpenBSD documentation that details how to install the OS onto something other than the built-in hard drive. This section in particular tells the user exactly what to do: http://openbsd.org/faq/faq14.html#flashmemBoot

If you want the interested user to try OpenBSD over Windows, you may want to include some instruction on how to do all that from a Windows machine.

Besides overriding the installation method and customizing the desktop, is there anything else you do to the installation? I am wondering whether you actually fine-tune the installation to run more smoothly from the USB drive.



Mon Nov 16 12:02:07 2009 Michiel

I recently replaced the prehistoric harddisk in my FreeBSD system with a brand new Intel X25 Solid-state-harddisk (I love the speed and silence). I didn't take any of the usual precautions to make the operating system lighter on the disk, like using tmpfs instead of /tmp, cronjob elimination, disabling logging etc... Instead, I make a periodic full backup and I just wait and see what happens. When it fails, I'll just replace it with a new SSD. My expectation is that nothing eventful will happen anytime soon.
