EMC Avamar Global Data Deduplication for Remote Offices

This video is very old, but I love it. It describes the Avamar product and benefits very well. It features Jedidiah Yueh, who developed the product originally called Axion and sold his company Avamar to EMC.




Selecting a backup as a service partner (BaaS)

Just a couple of years ago my company asked me to conduct a proof of concept for a particular backup as a service offering. Talk of the cloud had become louder and could no longer be ignored. Management was hoping to find a solution that could create some annuity-based income and would appeal to the SMB market. Backup technologies can get expensive very fast. A conventional backup solution requires servers, tape or disk targets and software. If tapes are used, the man hours needed to manage them have to be taken into account, and an offsite vault has to be engaged. If a disk solution is used, a colocation site is required for DR with replication. It adds up fast, and IT budgets rarely take the importance of backups and DR into account in planning. That’s why BaaS is a great option for SMB. For a monthly fee all your backups can be taken care of.

Sounds great, right? It has definitely gotten better. Just 2-3 years ago there were really not a lot of players in the market. There were home consumer products that offered plans for business, but not the intelligence required to properly protect databases and applications. At best these products, as well as the service I tested, provided a crash-consistent backup. That is, the application’s own robustness is relied on to recover itself after the restore. If you yanked the power cable out of the wall on your Exchange server, would you be concerned about whether the database would be mountable at startup? You should be. Most likely you’ll be OK, but you are not 100% protected. Would you want to pay somebody a monthly fee to maybe protect your applications or databases?

Now, just a few years later, there are many options out there that claim to provide business-intelligent backups for this data. I say claim because I have not had the opportunity to test any myself since the POC I did a few years ago. So with so many options out there, how do you choose a BaaS provider? Here are some things to consider.


Where will your data be?

It’s important to know where your data will ultimately be sitting. Is it in the data center of a known, trusted provider or in the CEO’s basement? A site inspection should be included as part of the due diligence, which brings me to my next point.


Consider geographically where your data will be.

Will your data be crossing any physical borders into another country? If so, you will want to avoid a company that ships your data to geopolitically unstable nations, not that I know of any that do. Most likely your data will be crossing the US/Canada border. Your company may or may not have concerns about that, but it should still be taken into account.


It’s good to ask a lot of questions about the technology and ensure you have a solid understanding of how it works. A good BaaS solution should use data deduplication, and it should be done on the client side. This is required to reduce the amount of data that has to be moved. Compression should also factor into the solution, as should encryption if required. Is any of your company data sitting encrypted on disk? If so, how well will it dedupe? Engage a trial period and test both backups and restores to ensure the solution can meet your expectations.
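
As a rough illustration of why client-side deduplication cuts the data moved, and why data that is already encrypted dedupes poorly, here is a minimal Python sketch. It assumes fixed 4 KiB chunks and SHA-256 hashes purely for simplicity. Real products generally use their own chunking and hashing schemes, and the function names here are hypothetical.

```python
import hashlib
import os

# Assumed fixed chunk size, for illustration only.
CHUNK_SIZE = 4096


def chunks_to_send(data: bytes, known_hashes: set) -> list:
    """Return only the chunks whose hash has not been seen before."""
    new_chunks = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in known_hashes:
            known_hashes.add(digest)
            new_chunks.append(chunk)
    return new_chunks


def fraction_sent(data: bytes) -> float:
    """Fraction of the data that would actually cross the wire."""
    sent = chunks_to_send(data, set())
    return sum(len(c) for c in sent) / len(data)


# Highly repetitive data: 100 copies of the same 4 KiB block.
repetitive = (b"A" * CHUNK_SIZE) * 100
# Random bytes stand in for encrypted data, which has no repeating chunks.
encrypted_like = os.urandom(CHUNK_SIZE * 100)

print(f"repetitive data sent:     {fraction_sent(repetitive):.0%}")
print(f"encrypted-like data sent: {fraction_sent(encrypted_like):.0%}")
```

The repetitive input collapses to a single unique chunk, while the random input has to be sent in full. That is the behaviour to ask a provider about if your data sits encrypted on disk.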



Most likely the service will leverage some kind of compression or deduplication technology to limit the amount of data that has to be moved. The question is, can they estimate how long the initial level 0 backup will take to complete? Can you run your existing backup solution simultaneously during this level 0? Depending on the amount of data, it could take weeks to months to complete a level 0 of all your data. Also consider when your billing will start. Will it start on the day the first level 0 begins or when it completes? A good strategy would be to break up the company’s data profile into chunks, use known metrics to estimate how long a level 0 of each chunk would take, and bill incrementally from there.
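
The per-chunk estimate can be done on the back of an envelope. The sketch below is only that: the chunk names, sizes, link speed, reduction factor and backup window are all made-up assumptions, and it ignores protocol overhead and server-side limits.

```python
def level0_days(data_gb: float, link_mbps: float, reduction: float = 1.0,
                hours_per_day: float = 24.0) -> float:
    """Days needed to push data_gb over a link_mbps link.

    reduction is the fraction of data left after dedupe/compression
    (1.0 means no reduction). hours_per_day is the daily backup window.
    """
    bits = data_gb * 8 * 1000**3 * reduction   # decimal GB to bits
    seconds = bits / (link_mbps * 1000**2)     # Mbps to bits per second
    return seconds / 3600 / hours_per_day


# Hypothetical data profile, split into chunks (sizes in GB).
profile = {"file shares": 2000, "mail": 500, "databases": 300}

for name, size_gb in profile.items():
    days = level0_days(size_gb, link_mbps=50, reduction=0.5, hours_per_day=10)
    print(f"{name:12s} {size_gb:5d} GB  ~{days:.1f} days")
```

Billing could then start per chunk as each level 0 completes, rather than on day one of the first transfer.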


Server/network bandwidth cost?

Another question is whether there are any throttling options. It’s important to understand the impact the backup will have on network and server resources, as the initial level 0 backup may take weeks to complete, which brings us to our next consideration.


Are there any seeding or shipping options?

Quick recovery of an employee’s spreadsheet over the WAN isn’t an issue, but what about a larger dataset? Some companies provide a service where a backup is completed to a local portable NAS and then shipped to the service provider to seed the initial level 0 backup, in the hope of completing it faster. This option would be required if your company has more than 5 TB of data. Conversely, what if you needed to restore all 5 TB of data in the event of a disaster? Recovery over the WAN would be inefficient to say the least, so could they restore the data to disk and ship it to you? This is an important consideration and could mean the difference between a faster recovery and going out of business in the event of a disaster.
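
To put numbers on the 5 TB restore question, here is a small Python sketch comparing a WAN restore against a shipped disk. The 48-hour shipping turnaround and the link speeds are assumptions for illustration only. Ask the provider for their actual figures.

```python
def wan_restore_hours(data_tb: float, link_mbps: float) -> float:
    """Hours to pull data_tb (decimal TB) over a link_mbps link, ignoring overhead."""
    bits = data_tb * 8 * 1000**4
    return bits / (link_mbps * 1000**2) / 3600


def faster_option(data_tb: float, link_mbps: float, shipping_hours: float) -> str:
    """Return 'wan' or 'ship', whichever gets the data back sooner."""
    return "wan" if wan_restore_hours(data_tb, link_mbps) <= shipping_hours else "ship"


SHIPPING_HOURS = 48  # assumed turnaround to restore to disk and courier it

for mbps in (20, 100, 1000):
    hours = wan_restore_hours(5, mbps)
    choice = faster_option(5, mbps, SHIPPING_HOURS)
    print(f"5 TB at {mbps:4d} Mbps: ~{hours:6.1f} h over WAN -> {choice}")
```

Under these assumptions the WAN only wins once the link is fast enough to beat the courier, which is why the shipping option matters for larger datasets.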



Google Drive


I first heard the rumors about Google Drive just yesterday. Cut me some slack, I’m still rebooting from vacation. I’ve been a Dropbox user for a year and I love it. The rumor of a free 5 GB basic plan sounded great, but I was even more impressed with the 25 GB for only $2.50 a month. So I took a look at it today, and here are some quick observations on how it differs from Dropbox and how it will fit into the Google toolbox.

I went to Google Drive on my iPad and was surprised to see my Google Docs already there. The revision history and collaborative potential look great. I could not do much else on my iOS device. A real iOS app will be coming soon. Expect some Google iOS app updates to feature added integration with the G Drive. I would love to see a G Drive option in GoodReader! Companies already invested in Google Docs and Gmail may be tempted to venture further into Google Plus to build an online collaborative work environment in the cloud.

When I got to a PC I downloaded the Google Drive app. Like Dropbox, an icon is created in the system tray and you define the local folder for your Google Docs sync, which it does very quickly, or appeared to. What is actually being stored is a link to the actual document in the Google cloud. Nice, but this does not provide any offline functionality. There is an offline option that requires a change to the settings as well as Google Chrome and an additional Chrome plugin. Google is of course promoting its search as a key differentiator, with the ability to search for text in docs as well as in images. This is a nice feature that I have leveraged Evernote for in the past.

There are a lot of other competitors out there that I have never tried, like SkyDrive and SugarSync. Some brief research found that Google Drive has the largest and most expensive premium option, 16 TB for $800 a month, as well as the largest file size limit of 10 GB. On cost, G Drive beats almost all the competition, except for SkyDrive, which does work out less expensive over the year. Being the data storage junkie I am





How to get CrashPlan to back up to a network drive (NAS)

I know it appears this blog is exclusively focused on backup technologies for the enterprise, specifically related to EMC products. That is my primary area of focus currently, and I’m using this blog as a repository of my ongoing learnings in this area. I would not be a very good backup expert if I did not protect my data at home. In the past I’ve used Mozy, Carbonite, Mainland’s mCloud (for important files only) and now CrashPlan to provide an offsite copy of my data. These products are great for quick recovery of files, and some have an option to deliver data on external media for large restores, for a price.

I recently bought a Seagate GoFlex NAS. We use it for primary storage of some media we download. All other important documents, pictures and music are on an internal 2 TB drive. I started using CrashPlan a few months ago. I really like the ability to perform backups to multiple targets. My plan was to perform a backup to the CrashPlan cloud and to also create a copy on the NAS for quick recovery. The GoFlex, like many home NAS devices, leverages some dark arts to make itself available on the network. It’s *nix based, and I found many guides online about hacking them, getting root and then leveraging Samba. That is also a great way to void your warranty, and a lot of work for me as I’m not that smart.

The GoFlex NAS creates some drive mappings, but the CrashPlan app does not allow backups to a network drive mapping. Instead I used the native VHD feature in Windows 7 to create what appears to be a local drive that CrashPlan can use, but is actually a file sitting on my NAS. Here is how.


Go to Computer Management, right click on Disk Management and select Create VHD.


Browse to the destination where the VHD file will be stored. This should be on your NAS device. Select either a dynamically expanding or a fixed size disk. If fixed, ensure it is large enough to ingest the backup data.


In a moment you should see the new disk. Right click on it and select Initialize.


Right click on the volume again, select New Simple Volume and assign a drive letter.


Then go to CrashPlan > Destinations and select the Folders tab. Browse to the new drive letter and select Start Backup!

Wow! Was that ever fun. Super glad I spent my evening at home configuring backups. Can’t wait until tomorrow to go to work…please kill me.




Removing devices quickly with NSRAdmin


When I first started working with NetWorker at this particular client I ran into some issues. I can’t recall exactly what happened, but there was some mismatch with the devices. In my experience from the NetBackup world, it is always best and usually easiest to delete and re-add the devices. Just let the app scan and find what it will. As long as the OS can see the devices, the app will also.

This NW env at the time was 2 years old, built and then configured by staff that did not know much more about NetWorker than I did at the time. I do know one thing. Keep your media pools to a minimum. I learned this years ago to ensure you get the best utilization out of your tape media.

At this time I was not well versed in the NetWorker command line or nsradmin. I attempted to delete the jukebox and associated tape devices. I quickly realized that the devices could not be removed until they were removed from the media pools. OK… No problem… There are 95 media pools!!! O_o. Wow. Let’s not talk about the possible rationale. I was surprised EMC support could not advise on a quicker way to remove the devices from the pools than manually deselecting them in the GUI. I was doing some research today and I found this.



If you want to delete all the jukebox definitions, and all of the
devices use nsradmin cli…

# nsradmin
> . type: NSR jukebox
> show name
> print

(this will list all the jukebox definitions, If you are happy to delete
them, continue)

> delete
> delete

(you have to run delete twice – maybe more… Just keep running delete
until no records are found)

> . type: NSR pool
> show name; devices

(if any devices were owned by pools that would prevent you from deleting
them, update devices to be blank if so )

> update devices:

> . type: NSR device; media family: tape
> show name
> print

( make sure its selecting just those devices you want to delete)

> delete
> delete

I’m wondering if anyone has tried this. I have since whittled down the number of media pools greatly, but I am planning an upgrade soon that will require removing and re-adding the library. I’ll give it a try and let you guys know how it worked.



EMC World 2012

So I was able to talk my company into sending me to EMC World this year. I’m grateful for the opportunity and look forward to bringing back some learnings and ideas that I can implement to benefit my clients here in Calgary. The EMC World online coverage is pretty comprehensive, but I’m wondering if anyone has any specific questions they would like EMC and their experts to answer. Let me know and I will get the answers for you. Maybe there is a feature you would like to see in Avamar or NetWorker? Post your question in the comments or hit me up at backupbuddha on Twitter.

Also, if any past attendees have any advice they can offer to ensure I get the most out of this opportunity, that would be great.



VTL weirdness

I have an EMC VTL in my backup env. It was purchased just before EMC acquired Data Domain. Let’s just say there is a reason why EMC purchased DD. It really is the superior product. That being said, this VTL 4160 has served us well and, aside from one major outage, has been very stable. The problem I’m going to describe probably has more to do with NetWorker than the VTL.

I had noticed my clone jobs had been hanging. The job was looking for a particular virtual tape to copy to physical. The tape was in a virtual drive, and the associated message indicated “reading, done”. NetWorker was repeatedly trying to unload this tape without success. As you probably know, it is a bad idea to use anything other than NetWorker to move tapes around. If you need to, you MUST perform an inventory afterwards so NetWorker is aware of the location of the tape or tapes. So to clear this up I first manually kicked the tape out of the virtual drive via the CDL GUI. I then performed an inventory of the specific slot I put the tape in and of the drive that NetWorker THOUGHT had the tape.

When the inventory was completed and NetWorker was aware that all tapes were in their home slots, I restarted NetWorker. Voilà!

Note: Be careful with the options for inventory. You can accidentally inventory the entire library and have it load and read all tapes. That may take some time depending on the size of your library.



nsradmin is your friend

I have a confession to make. I like the GUI. Don’t hold it against me. I know most of us in Backup Recovery, and the products we support, have roots in UNIX. I am well aware of how superior the command line can be and generally is. That’s what this post is about. I find sometimes the NetWorker GUI can be finicky about what it will and won’t let you do, even though there may be a menu option for it. So this week I noticed some incorrect devices hanging around. I had taken a vacation a while ago and it looks like somebody was having some fun. Grrr!

Do you think I can right click and delete? Nope!  So I configure the device as stand alone, again I try to delete it. Nada! Denied! So what to do? Thanks to EMC support for showing me this sometime ago. To the nsradmin utility!

Use tab to move through the menu. First choose NSR device from the select menu, then use next to move through the device list.

When you find the device in question, select delete.


Boom! Take that, careless co-worker with no respect for others’ backup env!





Avamar Garbage Collection Runtime


Avamar decides how long to run garbage collection by averaging its past run history. This is problematic if you have a lot of data to clean up and remove. The command below turns the history off (usehistory=false) and uses maxtime to set how long the GC is allowed to run.


dpnctl stop maint —> stop maintenance scheduler

dpnctl status   —> confirm maint scheduler is stopped

avmaint garbagecollect --ava --kill=0 --maxpass=0 --refcheck=true --throttlelevel=0 --usehistory=false --maxtime=3600

dpnctl start maint —> start maintenance scheduler

dpnctl status   —> confirm maint scheduler is started






NetWorker Command Line Restores

Typically, I launch client restores using the NetWorker client GUI on the source client. I think this is a best practice, as the source client should by default have all the access and permissions required to its own data. However, a few weeks ago I received a restore request for a client that is firewalled. We initially attempted the restore from the client, but ran into the following error.

Recovering files of client ‘cwf###’ from server ‘cls###’ to client cwf###’.
Recovering 1 file from D:\downstream\backup\infosys_exp_tosite\ into C:\temp
Total estimated disk space needed for recover is 23 KB.
Requesting 1 file(s), this may take a while…
Requesting 1 recover session(s) from server.
53363:winworkr: Recover of rsid 993513722 failed: Error receiving files from NSR server `cls##’
73724:winworkr: One or more recover threads failed to exit successfully
73731:winworkr: Parent recover thread exited with errors
52973:winworkr: Didn’t recover requested file C:\temp\EM700299_7002_infosys.2769
Received 0 file(s) from NSR server `cls###’
Recover completion time: 2/7/2012 10:34:10 AM

I initially assumed this may be related to the firewall. I decided instead to perform a redirected restore to one of my NetWorker storage nodes and get the restore done. My plan was to later examine the firewall setting on NetWorker and see if there was a problem there.

But! The redirected restore also failed with the same error. This was the first time I had attempted a redirect in this backup env, so what was the problem? Well, that is still pending, but I wanted to outline what we did with the help of EMC support to get the restore done.

First we identified via the winworkr GUI whether the files were there and what date they were backed up, then pulled the saveset ID.

[root@cls### ~]# mminfo -avot -q "client=cwf###,savetime>01/27/12" -r name,savetime,ssid,level | more
name date ssid lvl
VSS USER DATA:\ 01/27/2012 253990450 full
D:\ 01/27/2012 3861092125 6
C:\ 01/27/2012 3777206061 6
VSS OTHER:\ 01/27/2012 3340998683 full
VSS SYSTEM BOOT:\ 01/27/2012 3156449364 full

With the saveset ID in hand we ran the following mminfo command to confirm the required tapes.

mminfo -av -q ssid=69700782 -r name,sumsize,level,ssflags,savetime,volume
 name                             size  lvl ssflags date   volume
D:\                              11 GB incr vF   01/30/2012 001756
D:\                              11 GB incr vF   01/30/2012 VT0605

Once we confirmed the tapes were on hand, we launched the restore via the following.

C:\Program Files\Legato\nsr\logs>recover -vv -d c:\temp\dan -s cls### -S 38610921252 -a D:\downstream\backup\dailytrans\EA111299_dailytrans.2086
Recovering a subset of 166881 files within D:\ into C:\temp\dan
Requesting 1 recover session(s) from server.
recovering 166881 save set(s) on rsid 993552684
asm -r C:\temp\dan\downstream\backup\dailytrans\
asm -r C:\temp\dan\downstream\backup\
asm -r C:\temp\dan\downstream\
Received 3 matching file(s) from NSR server `cls213′
Recover completion time: 2/22/2012 2:57:58 PM


I was glad to get this restore done and have the opportunity to familiarize myself a little more with the NetWorker command line. I’m discovering that, as with NetBackup, the command line cannot be ignored when performing daily operations with this product.

