Category Archives: Computer Enthusiasm

Network File Sharing

What are the ramifications of using NAS (Network Attached Storage) instead of DAS?  Will I want to work directly off of it, or keep work locally and only use it for backing up?  The answer will vary with the nature of the work, I think.

File Copy

Let’s start with simple file copying.  This was performed using the command line, not the File Explorer GUI.

Large files

17 files totaling 273.6 GiBytes

Windows (CIFS) share: 49min 41sec, 94 MiBytes/second
NFS share: (no result)
Different drive on same machine: 48min 2sec, 97 MiBytes/second

Small files

14,403 files in 1,582 directories totaling 2.53 GiBytes

CIFS share: 64min 53sec, 683 KiBytes/second
Same drive: 3h 25min, 216 KiBytes/second
Different drive on same machine: 56min 50sec, 780 KiBytes/second

For large files, the transfer rate of 94 MiBytes/second (or 98.6 MBytes/second) is respectable.  Everything I could find on real-world speed of Gigabit Ethernet is outdated, with home PCs being limited by the hard drive speed.  Note that I’m going through two cheap switches between the test machines.

The small-file case is two orders of magnitude slower!  This bears out the common wisdom that it’s faster to zip up a large collection of files first, and then transfer the zip file over the network, and unzip on the receiving side.
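The zip-first approach can be sketched in a few lines of Python. This is only an illustration of the sequence, not a tool I used for the tests above; the function name `transfer_as_archive` and its arguments are made up for the example. Note the caveat: for the saving to be real, the unpack step has to run on the receiving machine, whereas this sketch does everything in one process.

```python
import shutil
import tempfile
from pathlib import Path


def transfer_as_archive(source_dir, dest_dir):
    """Zip source_dir, copy the single archive to dest_dir, and unpack it there."""
    source_dir = Path(source_dir)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    with tempfile.TemporaryDirectory() as staging:
        # One local pass over the many small files produces a single archive.
        archive = shutil.make_archive(
            str(Path(staging) / source_dir.name), "zip", root_dir=source_dir
        )
        # Only one large file crosses the network (or drive) boundary.
        remote_archive = shutil.copy2(archive, dest_dir)

    # Unpack next to the archive, then delete the archive itself.
    # (Assumption: in real use this part would run on the receiving side.)
    target = dest_dir / source_dir.name
    shutil.unpack_archive(remote_archive, target)
    Path(remote_archive).unlink()
    return target
```

The point of the design is that the thousands of per-file open/close operations happen locally on each end, and the network only ever sees one large sequential transfer.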

I think that the speed is limited by the individual remote file open/close operations, which are slow on Windows, and the network adds latency if this is a synchronous operation.  The DAS (different drive on the same computer) is only 14% faster than the NAS in this case.  The data transfer time of the file content is only 0.7% of the time involved.  The real limiting factor seems to be the ~4 files or directories processed per second.  That does not sound at all realistic, as I’ve seen programs process many more files than that.  There must be some quantity after which it slows down to the observed rate.  Since it is similar for DAS and NAS, it must be a Windows problem.  I’ll have to arrange some tests using other operating systems later.
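For anyone who wants to check the arithmetic behind those figures, here is a small Python sketch that recomputes them from the sizes and times in the tables. The only inputs are the numbers already quoted; the variable names are mine. Expect small differences from the quoted values because the table entries were rounded.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2
KIB = 1024


def seconds(hours=0, minutes=0, secs=0):
    """Convert an h/min/sec duration to seconds."""
    return hours * 3600 + minutes * 60 + secs


# Large-file copy over the CIFS share: 273.6 GiB in 49min 41sec.
large_bytes = 273.6 * GIB
large_rate = large_bytes / seconds(minutes=49, secs=41)  # bytes per second

# Small-file copy over the CIFS share: 2.53 GiB in 64min 53sec.
small_bytes = 2.53 * GIB
small_time = seconds(minutes=64, secs=53)
small_rate = small_bytes / small_time

# Same small-file copy to a different local drive: 56min 50sec.
das_rate = small_bytes / seconds(minutes=56, secs=50)

# Time the small-file payload would need at the large-file rate.
pure_transfer_time = small_bytes / large_rate

print(f"large files: {large_rate / MIB:.1f} MiB/s ({large_rate / 1e6:.1f} MB/s)")
print(f"small files: {small_rate / KIB:.0f} KiB/s")
print(f"slowdown:    {large_rate / small_rate:.0f}x")
print(f"DAS vs NAS:  {(das_rate / small_rate - 1) * 100:.0f}% faster")
print(f"data share:  {pure_transfer_time / small_time * 100:.1f}% of elapsed time")
print(f"entries/sec: {(14403 + 1582) / small_time:.1f}")
```

The last line counts both files and directories, since each one costs a create operation on the destination.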

Working with Files

Compiler

This is what I do all day.  What are the ramifications of keeping my work on the NAS, as compared with other options?

Compile Build Job

Project located on local drive (a regular directory or a VHD makes no difference): 4min 9sec
Project located on NAS, accessed via CIFS (normal Windows share): 10min 29sec
Using NFS share instead: N/A

The Microsoft Visual Studio project reads a few thousand files, writes around 1500, reads those again, and writes some more.  When the project is located on a local drive, the CPU usage reads 100% most of the time, indicating that the job is CPU bound.

When the project is located on the NAS, the situation is quite different.  Given that the actual work performed is the same, the difference in time is due to file I/O.  And the extra I/O takes more time than the job did originally; that is, the time more than doubled.  It was observed that the CPU utilization was not maxed out in this case.  The file I/O dominated.

The same job was performed again immediately afterwards, giving the computer a chance to use cached file data it had already read recently.  That made no difference to the time.  It appears that with Windows (CIFS) shares, even on Windows 7 (the sharing protocol was significantly reworked as of Vista), file data is not cached in memory but is re-read each time it is needed.  That, or the “lots of small files speed limit”, or both, kills performance.
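A simpler way to probe the caching question, separate from the compiler, is to read one file twice in a row and compare the times. The sketch below is an assumption about how one might do that in Python, not the method used for the build timings; `timed_read` and `cold_and_warm` are names invented here, and the first read is only "cold" if the file has not been touched recently.

```python
import time


def timed_read(path, chunk_size=1024 * 1024):
    """Read a file end to end and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while handle.read(chunk_size):
            pass
    return time.perf_counter() - start


def cold_and_warm(path):
    """Time two back-to-back reads of the same file.

    If the second read is much faster than the first, the data was
    probably served from a cache rather than fetched again.
    """
    first = timed_read(path)
    second = timed_read(path)
    return first, second
```

Pointing `cold_and_warm` at a large file on the share and then at a copy on a local drive would show whether the two cases really behave differently.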

I tried to repeat that using a NFS share instead of the CIFS share.  However, I could not get it to work at all.  The Windows machine could see the file names and navigate the directories, but could not read any file.

Video Encoding

Encoding a video entailed reading one very large file and writing one smaller file.  The process’s performance metrics indicate reading only 2.5MB/s and writing merely 80KB/s.  I would not expect it to matter if the input file, output file, or both were on the NAS.

Likewise, video editing and programs like Photoshop will read in the files and maintain the contents in memory or manage their own overflow swap space (which you put on a local drive).  It’s harder to do actual timing here, but the impression is that various programs are perfectly responsive when the files are directly attached.  If that changes when using the NAS instead, I’ll note the circumstances.

Caveat

All of the performance characteristics above are made with the assumption that the storage unit and the network links are all mine for the duration of the test.  If multiple people and pets in the household are using the NAS, you have the added issue of having to divide up the performance among the simultaneous users.

Note that FreeNAS does support link aggregation, so I could plug in two gigabit Ethernet cables if I replaced the switch with one that also understood aggregation.

I need a (home made) NAS!

I ran out of space on my RAID-5 drive, which I built several years ago.  At the time, 1TB drives were as large as you could get before the price increased disproportionately to the capacity.  I bought 2 “enterprise” grade drives for around $250 each, and a consumer drive for half that.  The usable capacity is 2TB because of the redundancy.
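The capacity arithmetic is the usual RAID-5 rule: one drive’s worth of space goes to parity, and every member counts only as much as the smallest drive. A tiny sketch (the function name is mine):

```python
def raid5_usable_tb(drive_sizes_tb):
    """Usable RAID-5 capacity in TB for the given member drive sizes."""
    if len(drive_sizes_tb) < 3:
        raise ValueError("RAID-5 needs at least three drives")
    # One drive's worth of parity; members are limited by the smallest drive.
    return (len(drive_sizes_tb) - 1) * min(drive_sizes_tb)


print(raid5_usable_tb([1, 1, 1]))  # three 1TB drives -> 2
```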

I decided I was not going to lose data ever again.  Having redundancy against drive failure is one component of this.  So, all my photos and system backups are stored there, along with anything else I want to keep safe, including Virtual Machine images.

It turns out that a lot of the space is taken by the daily backups.  Even with a plan that uses occasional full backups and mostly incremental backups, they just keep on coming.  I also need to better tune the quota for each backup task, but with multiple tasks on multiple machines there is no feature to coordinate the quotas across everything.

Meanwhile, a new drive I installed had a capacity of 3TB by itself.  Lots of room for intermediate files and things I don’t need to keep safe.  But that’s more to be backed up!

Now I could simply replace the drives with larger ones, using the same Directly Attached controller and chassis space.  But there are reasons for looking at Network Attached storage.  They even have Drobo NAS units at WalMart now, so it must be quite the mainstream thing.

Besides future upgradability to more drives (rather than just replace the drives with larger ones again), and better compatibility with different operating systems used in the home, and specialized media services for “smart” TVs and tablets, a compelling reason for me is the reason I’m using a RAID-5 in the first place:  to preserve my data.  As I’ve noted elsewhere, silent data loss is a bigger problem than generally realized, and it is actually growing.  Individual files somehow go bad, and are never noticed until long after you have reused backups from that period.
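To make the “silently go bad” idea concrete, here is a rough Python sketch of the kind of check a file system could do for you: record a checksum for every file, and later flag files whose content changed even though their size and modification time did not. This is only an illustration using ordinary files and SHA-256; the function names are invented for the example.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's relative path to its size, mtime and checksum."""
    root = Path(root)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            stat = path.stat()
            manifest[path.relative_to(root).as_posix()] = {
                "size": stat.st_size,
                "mtime_ns": stat.st_mtime_ns,
                "sha256": file_digest(path),
            }
    return manifest


def silently_changed(old, new):
    """Names of files whose bytes differ although size and mtime are unchanged."""
    suspects = []
    for name, before in old.items():
        after = new.get(name)
        if after is None:
            continue  # deleted files are a different problem
        same_metadata = (after["size"] == before["size"]
                         and after["mtime_ns"] == before["mtime_ns"])
        if same_metadata and after["sha256"] != before["sha256"]:
            suspects.append(name)
    return suspects
```

A file that was deliberately edited gets a new modification time and is skipped; what remains is the suspicious case where nothing claims to have touched the file and yet the bytes are different.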

Direct-Attached storage (simply having the drive on my main PC) limits the choice of file systems.  In particular, Windows 7 doesn’t have full support for anything other than NTFS and several varieties of FAT, and newer, more advanced file systems are only available on a few operating systems as they are not widely used yet.

A file system that specifically keeps data integrity and guards against silent errors is ZFS.  When I first learned about it, it was only available for BSD-family operating systems.  A NAS appliance installation (using FreeBSD) called FreeNAS started up in 2005.  More generally, someone could run a FreeBSD or Linux system that has drives attached that are using ZFS or btrfs or whatever special thing is needed, and put that box on the network.

As I write this, a Drobo 5N (without disks) sells for $550.  It reportedly uses a multi-core ARM CPU, and is still underpowered according to reviews.  Most 2-disk systems seem to use one of two ARM-based SoCs that cost about $30.  Now you could put something like the Addonics RAID-5 SATA port multiplier ($62) on that to control more disks at a low price.  Most 5-disk home/SOHO NAS systems seem to be based on x86 Atom boards.

Anyway, if you used hand-me-down hardware, such as the previous desktop PC you just replaced with a newer model, you’d have a much more powerful platform for free.  A modest PC motherboard, CPU, and RAM bought for the purpose (supposing you had a case and power supply lying around) could be found for … (perusing the NewEgg website for current prices) … $225.

So basically, if you know what you’re doing (or can hire someone to do it for you for a hundred dollars) you can get hardware substantially more powerful for a fraction of the price.

Being an enthusiast who’s never bought a pre-made desktop PC, it’s a no-brainer for me to put something together from parts, even if it only had the features of these home/SOHO appliances that have become so common, and even if I don’t re-use anything I have on hand.

But, none of the NAS boxes I see advertised discuss anything like the silent data corruption problem.  They don’t say what kind of file system is being used, or how the drives might be mounted on a different system in the event of a board (not drive) failure if a replacement for the exact model is no longer available.  I would think that if a NAS had advanced data integrity features then it would feature prominently in the advertising.  So, build I must, to meet the requirements.

In future posts I’ll discuss the silent corruption problem at more length, and of course show what I actually built.  (I’ve named the NAS server OORT, by the way.)

 

 

 

 

Happiness is having a good backup

I’ve had my share of hard drive failures and software accidents, and more often than not I’ve been able to recover.  Here are my current back-up provisions:

  • Daily backups using Acronis
  • Windows 7 “Previous Versions” feature
  • Backup copies of important files on different drives
  • Backup copies of important files on different machines at home
  • Annual off-site backup of entire machine
  • Important files stored using RAID-5

The technology of back-up media has changed over the years.  Once upon a time I used a stack of 50–100 5¼″ floppy disks!  I also remember when I went to DAT tape, which could hold a full backup and incremental backups every day for a month on one 400 MB cartridge.  Eventually came recordable CDs, and later DVD-RAM.  Along the way were 20MB “floptical” disks, Jaz drives, and Zip disks.

Today, the best backup medium is another hard drive.  A cheap on-sale hard drive has a better price per gigabyte than optical media of reputable quality, and nothing else is even close.  Desktop HDDs are also quite robust—I’ve heard of data recovery companies reading a hard drive after a house fire had destroyed the computer.  Plastic optical media would be toast!

Partition vs File

There are two fundamentally different kinds of backup.  For typical data applications, “documents” are files and can be easily copied elsewhere for safe keeping.  Any manner of copying a file (and copying it back where it came from) will serve to back up a word processing or any kind of office document, photo, video, etc.

But the “system” is different.  The operating system and the arrangement of installed programs has files you don’t understand, and even special things in special places on the hard drive.  The way to back that up is to make an exact sector-by-sector image of the partition.  This requires specialized software both to make and restore.

That is also one reason why I still keep my data separate from the system.  My C: drive is for the operating system and installed programs, and my files are on a different partition (say, E: and G:).  On Windows this means ignoring the prepared My Documents locations or taking steps to point that to another partition.

It has definite advantages, and I’ve made good use of it recently.  When updating some program caused problems, I simply restored to the previous day’s system backup of the entire C: drive.  My work, which was on E:, was not affected.  Had my work been on C: also, this step would have erased my efforts that were performed since that backup point.

Multiple Methods

Besides using different tools for the System backup and your day-to-day work files, you can use a variety of different overlapping techniques all at the same time.  You don’t have to use one tool or another.  You can use a 3rd party backup suite and casually replicate your work to your spouse’s computer.  Even with a single tool, you can have automated daily incremental backups to another drive and make monthly full backups to Blu-ray to store off-site.

Automatic and Frequent

I used to boot the computer specially in order to do a complete partition back up of the normal C: drive.  I would do so before making significant changes, and was supposed to do so once a month regardless.  But it was a chore and a bother.

Now, Windows can reliably back up the running C: drive using an operating system feature called Volume Shadow Copy.  Being able to perform the backup while running the regular system is liberating, because it can be done automatically on a timer, and it can be done in the background.  So I have Acronis True Image perform daily backups of the C: drive.

Likewise, the same technology applies to data files.  Even if I happened to be still working at the odd hour at which I scheduled the daily file backup, using the files would not conflict with the backing up.

Windows Previous Versions feature

Windows 7 has a feature called Previous Versions that can be handy.  You can turn on System Protection and also enable it for your data drive.  Use the System control panel applet, and there is a tab for System Protection.

Windows 8 File History

Previous Versions is deprecated but still available on Windows 8.  Windows 8 revamps the general idea with File History, which is said to be more like Mac’s Time Machine.  It backs up to an external drive or network location, and it runs hourly (or at a customizable interval).

Search for file history on the Windows 8 Start screen to get to the applet.  However, there seems no way to specify which files are being backed up!  It only and always applies to places that are part of a Library.  So I worked-around it by adding the directories of interest to a Library in File Explorer.

Windows Restore

I tried (on Windows 7) using its supplied System Backup feature, and was less than thrilled with it.  It backs up to a hidden directory on the same drive, and I don’t know what it does about having multiple versions stored there.  And I can’t simply copy the backup file elsewhere.  It’s actually the same feature that Previous Versions uses, so I imagine it’s also better on Windows 8.

Drill!  Be confident

Make sure you know how to restore files, and that it actually works.  When an urgent deadline coincides with a messed up file, that is not the time to be figuring out an unfamiliar system.

So, after you initiate your automated backup system for work files, also create one or more scratch files of the same kind you normally work with.  A silly word processor document containing a stupid joke, perhaps.  After a couple days, when the automated system has had time to do its thing, delete the file.

Now, get it back.

Make notes, and keep them on actual paper, to refer to when this is not a drill.

Then, you can be happy.  Be smug even, especially when someone else has “an incident”.

 

How do Phishing scammers get your personal information?

Today I got email that pretended to come from eBay, in the form of a fake invoice that is actually bait to get you to click on one of the links in the message.  This is known as “phishing”, as explained on Wikipedia.

Now this particular message was sent to the wrong email address.  I use a unique email address for each online merchant or other purpose such as forums and any other kind of sign-up.  The particular service I use, and have been happy with for many years, is https://www.sneakemail.com.  It is handy for me to keep track of order information and forum sign-up data too, for low-to-medium security purposes (I keep passwords for banking sites and such in a password vault).

So, when I got this scam email, I knew that it was not really sent from PayPal.  It was sent to the address I used for Things From Another World, “best online store to buy comics, graphic novels, manga, and pop-culture collectibles!” and apparently to have your customer information stolen, too.  I used this email address on an order made January 3, 2006, for the Serenity comics, in case you’re interested.  That’s just to point out how Sneakemail helps me track these things.

So now their customer database winds up in the hands of criminals.

This is not the first time it has happened.  Other companies have been caught supplying email addresses and perhaps first/last names (and who knows what else?) to those who then send spam or phishing email.  Most of the time they totally ignore repeated inquiries to their customer service, support, help, or other email addresses.

But when I have gotten an answer (e.g. from dyndns), it usually turns out to be blamed on the company they use for their newsletters, and they promise that it only included non-sensitive information.  So, that’s another reason to be sure to un-check any kind of newsletter subscription that they usually have on the check-out page.

Now, with Sneakemail, I can activate greylisting on an individual address, set up filtering (which is handy for addresses used for mailing lists and forums) to only allow through the intended correspondent, and, when necessary, disable or delete that individual address.  Deleting the address I used to place orders with TFAW or Oratec does not affect any other address, so all my other correspondence is not bothered.

Update May 11

I received a reply from someone at TFAW.com dated Friday afternoon.  That’s about a 24-hour turn-around, which is remarkable in these cases.

She said, (bold mine)

Thank you for contacting us, and notifying us of this matter.  We definitely do not rent or sell out any customer information at all, and any information provided on our site is kept completely confidential.  We do list our privacy policy confirming this in our site’s help pages that you can review at the following location: http://www.tfaw.com/Help/Privacy-Policy___35   We had our technical team look into this matter and have confirmed that there have not been any compromises in security on our end.  We definitely understand your concern on this matter, but rest assured no personal information has been passed along or obtained from our site.  We here at TFAW.com take privacy concerns very seriously, and actively ensure that all information is kept safe and confidential.

And also invited me to forward the message with the headers, for them to keep on file (not to further analyze?)

If I read that correctly, they didn’t give information to anyone such as a mailing list company, and nobody ever accessed their data surreptitiously (meaning their detection would be flawless even if the access control isn’t).  So what’s left? Deliberate access by someone on the inside.  Somehow I don’t think that’s what she meant.  Maybe email was gathered in-flight from their outgoing confirmation mail (the only time that address ever appeared in an email before the spam) only to be held for a couple years before being used for spam.

If some third party is listening in on email transit, I think there would be worse effects than just use of the address much later: such a person would have the receipt, invoice, and whatnot, containing order numbers and account information and could immediately spoof that person at that store, read the mail sent for a password reset, and go nuts.  However, the current state of security on email sent between parties on normal ISPs is far from tight.

July 28, 2014 — The Sock Company

I got another PayPal phishing message, this one sent to the email address I used with Thorlos socks.  I like their socks very much, and my notes indicate that it’s cheaper to order from them directly because of free shipping, unless the order is more than $55 in which case it’s better to buy from The Sock Company.  I’m sure prices have changed since I first ordered from them in 2005, but that is an illustration of the kind of records I keep and why I’m confident that nobody else would know of the email address to which I’m receiving these messages.

September 4, 2014 — Kingwin

Another occurrence, this time from Kingwin.  I emailed their tech support two years ago with questions about their USB SATA Dock products.  At least this time there’s no customer information with them, but only my name and a (unique) email address.

June 18, 2014 — LightRocket

I got an ad promoting something called LightRocket, sent to my historical original email address.  What I mean by that is once upon a time the Web was a nicer place and my email address was published on my web pages and various publications.  I still maintain it, but it’s only used by “cold calls”.

When I wrote back asking how I got on their mailing list, I got a real reply from someone in short order.  That’s nice, and hopefully I’ll find out something.  I’ll revise this report as I learn more.

January 4, 2015 — OEM PC World

I’ve ordered flash media from time to time from OEM PC World, most recently in July 2013.  Now I’m getting dozens of emails for mail order brides.  I’m sure that’s not really in their catalog, so how did these purported women get my contact info from them?  Interesting that a company that’s been “the world’s memory value leader” for over 15 years doesn’t have an email address itself, but can only be reached via a web form.

I received a real reply from someone later that same day.

Ongoing…

June 2015 — Paradigm Speakers tech support

I wrote Paradigm Speakers support email on 23 June 2014, and one year later I started getting PayPal and Apple ID Phishing email.  It wasn’t until November that I noticed some leaking through my normal spam filter, but I see it goes back at least to June 3.

In December 2016, I’m still getting junk from them (56 over the last 30 days), with nothing resolved.

December 4, 2016 — IcyDock

I got a PayPal Phishing email that slipped through my filters, that was sent to an email address used for product registration for Icy Dock, on November 21 2011.  Is this just the first to slip past the filters?  I checked the sneakemail stats and it was the only email to that address in the last 30 days.  So, the security breach of customer info is recent perhaps.

I contacted Icy Dock Sales, and quickly got a serious reply from a representative.  It’s refreshing to see that a company not only reads and responds to their own email, but gives a serious reply rather than some canned blather or blanket denials.  So, Kudos to them!

It would be great to discover if some particular 3rd party service were responsible for many of these incidents.  It would be possible if companies took it seriously and noted who was given customer information and when.  A culprit would show up as being common to many of them.
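If companies did keep such records, finding the common factor would be a simple counting exercise. The Python sketch below shows the idea with entirely made-up data: the store and vendor names are placeholders, not anything I learned from the companies mentioned above.

```python
from collections import Counter


def common_third_parties(incidents):
    """Return (vendor, count) for vendors shared by more than one leaking company."""
    counts = Counter()
    for vendors in incidents.values():
        counts.update(set(vendors))  # count each vendor once per company
    return [(vendor, n) for vendor, n in counts.most_common() if n > 1]


# Entirely hypothetical disclosures; none of these names are real.
incidents = {
    "Store A": ["MailerCo", "ShipFast"],
    "Store B": ["MailerCo", "CardProc"],
    "Store C": ["NewsBlast", "MailerCo"],
    "Store D": ["ShipFast"],
}
print(common_third_parties(incidents))
```

With this invented data, the vendor appearing for three of the four stores floats to the top of the list, which is exactly the kind of signal I would hope to see.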

 

 

Input

When you work with the computer all day, the human-machine interface is of critical importance.  I spend all day typing, so the workstation should optimize my ability to work rather than getting in the way.  So it is well worth putting some attention into getting it right.

Here is a photograph of the “input” portion of my work desk:

[Photo: kb-overview]

Main Keyboard

It’s mostly all black, so it is hard to see in the photograph I’m afraid.  And I do mean all black—the keys are solid black with no labels.  Visitors are surprised by that, and I point out, “neither does a piano.”  This is a Das Keyboard II, featuring Cherry MX mechanical switches.  It is a little different from the third iteration that was introduced in 2008, in that it does not have an internal USB hub, has a matte surface, and is a fully rectangular case, something that I make use of.

Using an unlabeled keyboard does indeed make one a better typist.  The first time I learned to type was on a mechanical typewriter that had belonged to my grandfather.  The buttons were small and round with nice gaps between them to catch your finger instead!  The keys needed to be depressed a long distance and with great force.

I wasn’t that great at touch-typing and had to look at the keyboard.  It was once I became a professional computer programmer (late ’80s) that I decided to better learn to touch type without looking.  I used Typing Tutor software, and learned the normal letter area quite well.  Typing forum posts and other prose writing, I easily made 60wpm.  But, I never really learned to “touch type” the funny characters like {} and &.  They are rare in prose, but bread-and-butter for programming languages.

I’d always appreciated a good quality keyboard.  Typing confidently with speed means not having the keys slide from side to side, and having a better feel than a cheapo keyboard.  When PC computers started getting cheap (shoddy!) keyboards, I found the Keytronics KB-101Pro; I wore the labels off it and started to wear down the plastic on some of the keys to a noticeable extent.  Eventually common keyboards improved in quality and an inexpensive keyboard from a stock PC was not too bad, and an inexpensive after-market keyboard was only subtly inferior to a high-end one.

A high-quality keyboard that was unlabeled made sense.  The labels wear off anyway, and you are not supposed to look while you type.  I recall someone improving upon the normal “don’t look” instructions for learning how to type well by using a box that hid it from view.

After I got this one, I had to learn all those funny keys that were not covered under lessons.  If I hit the wrong button, I could not cheat by looking at the labels.  I really had to learn them.  For letters and most other keys, I didn’t notice the difference because I never needed to look anyway.

Today, the only keys I ever worry about are the three to the right of the F-keys.  They are so rarely used at all, and never for their original labeled purpose anyway.  (Now I use them for different varieties of screen capturing.)

Keyboard Extension

[Photo: kb-closeup]

In this close-up you can see that there is another row of keys above the normal top row on the keyboard, and these are white.  Actually, they are hand-written labels under a clear flat cap.  This is a 16-key X-Keys stick.

The idea is pretty simple:  16 buttons, and USB.  But the software it came with was utterly useless.  The MacroWorks software could only issue codes for keys that you could already type!  I guess that’s for sending whole words with a single button, but I was specifically wanting it for characters that were not already on the keyboard.  I tried to hack the saved file format, but the software didn’t like the codes anyway.  I tried to use the plain USB keyboard mode and program it with codes that are defined but not on my regular keyboard (there are actually a bunch more F-keys than 12, and a few non-US keys) and then use other software to map those to what I really wanted; but that didn’t work either (I don’t know if it was the XP operating system or more funny business with X-keys).

I was almost ready to give up when they pointed me to the Developers Kit.  Using the DLL they provide, I obtain the keypress.  Then it is trivial to call the Windows API function SendInput with the Unicode character.  Their original MacroWorks software installed a bunch of device drivers for a fake keyboard and fake mouse.  I asked why they didn’t just use SendInput: was it Win95 compatibility, or some other features?  They never answered, but the next version of MacroWorks did not install a bunch of drivers.
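For the curious, the SendInput half of that looks roughly like the following. This is a Python/ctypes sketch for illustration, not the program I actually wrote, and it leaves out the X-Keys DLL side entirely. The structure layouts and the `KEYEVENTF_UNICODE` flag come from the Windows SDK; the helper names `unicode_key_events` and `send_text` are made up here.

```python
import ctypes

# Constants from the Windows SDK (winuser.h).
INPUT_KEYBOARD = 1
KEYEVENTF_KEYUP = 0x0002
KEYEVENTF_UNICODE = 0x0004

ULONG_PTR = ctypes.c_size_t  # pointer-sized unsigned integer


class MOUSEINPUT(ctypes.Structure):
    _fields_ = [("dx", ctypes.c_int32), ("dy", ctypes.c_int32),
                ("mouseData", ctypes.c_uint32), ("dwFlags", ctypes.c_uint32),
                ("time", ctypes.c_uint32), ("dwExtraInfo", ULONG_PTR)]


class KEYBDINPUT(ctypes.Structure):
    _fields_ = [("wVk", ctypes.c_uint16), ("wScan", ctypes.c_uint16),
                ("dwFlags", ctypes.c_uint32), ("time", ctypes.c_uint32),
                ("dwExtraInfo", ULONG_PTR)]


class _INPUTUNION(ctypes.Union):
    # MOUSEINPUT is included only so the union has the size Windows expects.
    _fields_ = [("mi", MOUSEINPUT), ("ki", KEYBDINPUT)]


class INPUT(ctypes.Structure):
    _anonymous_ = ("u",)
    _fields_ = [("type", ctypes.c_uint32), ("u", _INPUTUNION)]


def unicode_key_events(text):
    """Build key-down and key-up INPUT records for each UTF-16 code unit."""
    raw = text.encode("utf-16-le")
    units = [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]
    events = []
    for unit in units:
        for flags in (KEYEVENTF_UNICODE, KEYEVENTF_UNICODE | KEYEVENTF_KEYUP):
            event = INPUT()
            event.type = INPUT_KEYBOARD
            event.ki.wVk = 0          # must be zero when KEYEVENTF_UNICODE is set
            event.ki.wScan = unit     # the character itself goes in wScan
            event.ki.dwFlags = flags
            events.append(event)
    return events


def send_text(text):
    """Inject text into the focused window. Windows only."""
    events = unicode_key_events(text)
    array = (INPUT * len(events))(*events)
    return ctypes.windll.user32.SendInput(len(events), array, ctypes.sizeof(INPUT))
```

On a Windows machine, `send_text("×")` should type a multiplication sign into whatever window has focus, with no fake keyboard driver required.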

My motivation for getting a keyboard extension was writing a chapter of text full of things like “1.23×10⁸ kg∙m∙s⁻²”.

Trackball

To the right of the keyboard is a Kensington trackball.  It nicely matches the black motif, but that’s simply the color they sell it in.  Years ago, I realized that reaching for the mouse was not at all ergonomic.  Having piles of stuff on the desk doesn’t help either!  So I tried trackballs.  Many are too poorly made and are useless.  I actually liked the Microsoft model, but eventually wore down the ridges on the scroll wheel and they stopped making it so I could not get another of the same.  Other “ergonomic” trackballs I’ve tried make the mistake of thinking that one size fits all.  When the placement is for hands three sizes too small, it becomes the exact opposite of ergonomic.

Kensington has made professional trackballs since before mice became mainstream.  And their classic model is hard to improve upon for the actual ball.  Some day I’ll make a mod to add mouse buttons on the left side of the keyboard.

Shuttle

To the left of the keyboard is another gizmo, a ShuttlePro2.  This was recommended for use with some software I was using to edit video.  Since it can be programmed to issue keypresses for the jog and shuttle actions, it can be made to work with other software that doesn’t specifically know about it too, if you watch out for where the cursor is or what control is activated.  It’s basically a must-have for video editing and quite helpful for audio too.  For other programs it’s a bunch of extra buttons too!

Wacom Tablet (not pictured)

For photo editing, I use a 9×12 inch Wacom tablet.  I got it around the year 2000, so I’m worried that they will stop supporting it although it works fine and there is no real reason to get a newer one.  What the trackball can’t do well, a mouse isn’t very good at either:  try writing your name in a painting program with a mouse.  The stylus is the tool to use.