Amazon EFS

What Is Amazon Elastic File System?

Amazon Elastic File System (Amazon EFS) provides simple, scalable file storage for use with Amazon EC2. With Amazon EFS, storage capacity is elastic, growing and shrinking automatically as you add and remove files, so your applications have the storage they need, when they need it.

Amazon EFS has a simple web services interface that allows you to create and configure file systems quickly and easily. The service manages all the file storage infrastructure for you, avoiding the complexity of deploying, patching, and maintaining complex file system deployments.

Amazon EFS supports the Network File System version 4.1 (NFSv4.1) protocol, so the applications and tools that you use today work seamlessly with Amazon EFS. Multiple Amazon EC2 instances can access an Amazon EFS file system at the same time, providing a common data source for workloads and applications running on more than one instance or server.

Amazon EFS Pricing

With Amazon EFS, you pay only for the amount of file system storage you use per month. There is no minimum fee and there are no set-up charges. There are no charges for bandwidth or requests.

Region                    Price per GB-month

US East (N. Virginia)     $0.30
US East (Ohio)            $0.30
US West (Oregon)          $0.30
EU (Ireland)              $0.33
Asia Pacific (Sydney)     $0.36

For example, assume your file system is located in the US East (N. Virginia) region, uses 100GB of storage for 15 days in March, and 250GB of storage for the final 16 days in March.

At the end of March, you would have the following usage in GB-Hours:

Total usage (GB-hours) = [100 GB x 15 days x (24 hours / day)] + [250 GB x 16 days x (24 hours / day)]

= 132,000 GB-Hours

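We add up GB-Hours and convert to GB-Months to calculate monthly charges. March has 31 days x 24 hours = 744 hours:

Total usage (GB-months) = 132,000 GB-Hours / 744 hours per month ≈ 177 GB-Months

Total Monthly Storage Charge = 177 GB-Months x $0.30 = $53.10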
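
The calculation above can be reproduced with a short Python sketch. The helper name monthly_storage_charge and its parameters are illustrative only and are not part of any AWS API. Rounding GB-months to a whole number is an assumption, chosen because it reproduces the 177 GB-Months figure used in the example.

```python
from decimal import Decimal

def monthly_storage_charge(usage, days_in_month, price_per_gb_month):
    """Return (gb_hours, gb_months, charge) for a list of (gb, days) periods.

    Hypothetical helper for illustration only, not an AWS API.
    """
    gb_hours = sum(gb * days * 24 for gb, days in usage)
    hours_in_month = days_in_month * 24
    # Assumption: GB-months are rounded to the nearest whole number,
    # which matches the 177 GB-Months in the worked example.
    gb_months = round(gb_hours / hours_in_month)
    # Decimal avoids binary floating-point noise in the money amount.
    charge = Decimal(price_per_gb_month) * gb_months
    return gb_hours, gb_months, charge

gb_hours, gb_months, charge = monthly_storage_charge(
    usage=[(100, 15), (250, 16)],  # March: 100 GB for 15 days, 250 GB for 16 days
    days_in_month=31,
    price_per_gb_month="0.30",     # US East (N. Virginia)
)
print(gb_hours, gb_months, charge)
```

Passing the price as a string keeps the Decimal value exact, so the result prints with two decimal places.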

Windows Server 2012 and 2016 do not include an NFSv4 client. They do include an NFSv4.1 server, but the bundled client supports only NFSv2 and NFSv3. Mounting an EFS file system requires a client that supports NFSv4, so EFS file systems can’t be accessed from Windows Server 2012 or 2016.

We understand the importance of Windows support, and we are taking your feedback into account along with customer requests for other features.

Definition and Backup Rotation

One of the key elements of every data backup is the definition of the rotation scheme, so that protection is guaranteed at least one day back. The best media rotation scheme is the one that guarantees data copies that are as long-lived, extensive and varied as possible.

Backing up data and then keeping the backups for more than one day is necessary.

Nevertheless, the cost or time required for a full backup every day can be impractical, especially for companies with huge amounts of data. That’s why many users run either differential or incremental backups on most workdays.

Types of Backup

Full Backup – during a full backup the selected files are backed up and their Archive attribute is cleared at the same time. The attribute serves to distinguish files that have been backed up from those that have not. When a file’s content changes, the Archive attribute is set again. A full backup is usually the starting point for incremental and differential backups, which reduce the time needed for backing up. Restoring the original state from a full backup requires only that one backup.


Incremental Backup – during this type of backup only the files whose Archive attribute is set are backed up, and the attribute is cleared afterwards. Thus only the files that have changed since the last full or incremental backup (or whose Archive attribute has been set manually in that time) are backed up. The backup takes significantly less time than a full backup, which is why it is usually used during the workweek. Restoring this backup alone is not enough to restore the original state. After a server or disk array failure it is necessary to restore the last full backup first and then every incremental backup created after it, in chronological order from the oldest to the newest. Incremental backups are therefore faster to create but slower to restore, because several backups have to be restored. Since a major failure is relatively unlikely, the incremental backup is often a good trade-off.

Differential Backup – during a differential backup only the files whose Archive attribute is set are backed up, and the attribute is not cleared afterwards. Thus all files that have changed since the last full backup (or whose Archive attribute has been set manually since then) are backed up. The backup takes significantly less time than a full backup, so it is also used during the workweek. Restoring this type of backup alone is not enough to restore the system to its original state either. After a server or disk array failure it is necessary to restore the last full backup first and then only the most recent differential backup created after it. Restoration therefore takes less time than with incremental backups, because only one differential backup has to be restored, while the time needed to create a differential backup depends on how many days have passed since the last full backup. On the first day after the full backup, an incremental and a differential backup take the same time; on the following days the differential backup takes longer and longer to create, but its restore time becomes relatively shorter compared with the growing chain of incremental backups.
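
The difference between the three backup types comes down to how each one treats the Archive attribute. The following Python sketch models a file system as a dictionary that maps a file name to its Archive flag. The function names and the three-file example are hypothetical and serve only to illustrate the behaviour described above; no real backup tool is being modelled.

```python
def full_backup(files):
    """Back up every file and clear its Archive attribute."""
    saved = sorted(files)
    for name in files:
        files[name] = False
    return saved

def incremental_backup(files):
    """Back up files whose Archive attribute is set, then clear it."""
    saved = sorted(name for name, archive in files.items() if archive)
    for name in saved:
        files[name] = False
    return saved

def differential_backup(files):
    """Back up files whose Archive attribute is set, leaving it set."""
    return sorted(name for name, archive in files.items() if archive)

def run_week(daily_backup):
    """Full backup, then one changed file per day for three days."""
    files = {"a.txt": True, "b.txt": True, "c.txt": True}
    full_backup(files)
    history = []
    for changed in ("a.txt", "b.txt", "c.txt"):
        files[changed] = True  # editing a file sets its Archive attribute
        history.append(daily_backup(files))
    return history

print(run_week(incremental_backup))
print(run_week(differential_backup))
```

In this sketch each incremental backup holds a single file, so a restore needs the full backup plus all three daily backups. The differential backups grow from one file to three, but a restore needs only the full backup and the last differential.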

Whether you choose incremental or differential backups as a complement to the full backup depends only on your company’s environment. It is better to have the analysis and policy proposals made by specialists, which avoids a badly designed backup policy and the misdirected investment in storage that it can cause.

After this short introduction to the available types of backup, we can briefly describe the most widely used tape rotation methods that use them.

Tape Rotations

Round Robin (scheme with one tape for each day)

The simplest tape rotation scheme reserves one tape for every day of the workweek. The tapes are labelled Monday, Tuesday, Wednesday, Thursday and Friday. Each day a full backup of the selected data is written to the corresponding tape. This rotation allows data to be restored from at most one week back. The scheme suits small companies that use an internal or external tape drive, or a NAS device with a VDL (virtual disk library) that can serve as primary storage. It is also suitable wherever a full backup can be performed every day and a one-week history is sufficient.

Grandfather-Father-Son (GFS)

The ‘Grandfather-Father-Son’ backup scheme is one of the most widely used. It uses daily (Son), weekly (Father) and monthly (Grandfather) medium sets. Four medium sets are labelled for the daily backups of the workweek (i.e. Monday to Thursday). The incremental backups are written to these medium sets (the Sons in the GFS scheme), and they are overwritten again the following week. Another group in the GFS scheme consists of five medium sets labelled Week 1, Week 2, and so on (Father). See the picture:

The full backups are written to the Father medium sets every week, on the day when the Son medium sets are not used; the retention period of the Father group is one month, after which the sets are overwritten. The final group, Grandfather, consists of three medium sets (a medium set may comprise one or many tapes) labelled Month 1, Month 2, Month 3, and so on. These medium sets are overwritten once every three months or less often, depending on how many sets are devoted to the Grandfather group. The expiration of these sets (that is, the point at which they may be overwritten) is set according to the number of medium sets in the Grandfather group. Each medium set in a group (Son, Father or Grandfather) is either a single tape or a set of tapes, depending on the amount of data backed up. The total number of medium sets used in the GFS backup scheme is twelve. Because of tape wear, and in order to keep a longer history (archiving), it is recommended to replace the medium sets with new ones at regular intervals.
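
As a rough illustration, the choice of medium set for a given date can be sketched in Python. The text does not say on which day the weekly and monthly backups run, so this sketch assumes that the weekly (Father) backup runs on Friday and that the last Friday of a month is the monthly (Grandfather) backup. The function name gfs_medium_set, the label format and the way the Father sets are rotated are likewise assumptions made for the example.

```python
from datetime import date, timedelta

SON_LABELS = ["Monday", "Tuesday", "Wednesday", "Thursday"]

def gfs_medium_set(day, father_sets=5, grandfather_sets=3):
    """Return the label of the medium set used on `day`, or None at weekends."""
    weekday = day.weekday()  # Monday == 0 ... Sunday == 6
    if weekday > 4:
        return None
    if weekday < 4:
        # Monday to Thursday: incremental backup on the Son set for that day.
        return f"Son {SON_LABELS[weekday]}"
    # Friday. Assumption: the last Friday of a month is the monthly backup.
    if (day + timedelta(days=7)).month != day.month:
        return f"Grandfather Month {(day.month - 1) % grandfather_sets + 1}"
    # Assumption: other Fridays rotate through the Father sets, one per week.
    week_index = day.toordinal() // 7
    return f"Father Week {week_index % father_sets + 1}"

for d in (date(2024, 3, 4), date(2024, 3, 8), date(2024, 3, 29)):
    print(d.isoformat(), gfs_medium_set(d))
```

With the default of three Grandfather sets, each monthly set is reused after three months; passing a larger grandfather_sets value lengthens the history in the way the text describes.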

Tower of Hanoi 

The Tower of Hanoi scheme is based on the logic puzzle of the same name, introduced by the French mathematician Édouard Lucas in 1883. The objective of the game is to move five disks from one rod to another in the minimal number of moves. Only one disk may be moved at a time, and no disk may be placed on top of a smaller disk. It has been proven that the minimum number of moves is 31. The Tower of Hanoi method uses five medium sets for backup:

  • Medium set A is used every other day
  • Medium set B is used every fourth day
  • Medium set C is used every eighth day
  • Medium sets D and E are used in turns, each every sixteenth day

The Tower of Hanoi schedule is as follows:

The backup starts on medium set ‘A’ and is repeated every other day. The first backup on medium set ‘B’ takes place on the first day that is not used by ‘A’ and is repeated every fourth day. Medium set ‘C’ starts on the first day not used by ‘A’ or ‘B’ and is repeated every eighth day. Sets D and E follow the same rule: each starts on the first day not yet used by the other sets and is repeated every sixteenth day.
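
One common way to generate such a schedule is to count the trailing zero bits in the binary form of the day number: no trailing zeros selects set A, one selects B, and so on, with everything beyond the last set falling back to it. This is a sketch of that technique rather than something the text prescribes; the function name hanoi_set and the numbering of days from 1 are assumptions.

```python
def hanoi_set(day, sets="ABCDE"):
    """Return the medium set for backup day `day` (days are numbered from 1)."""
    if day < 1:
        raise ValueError("day must be 1 or greater")
    # day & -day isolates the lowest set bit; its position is the number
    # of trailing zero bits in the binary representation of the day.
    trailing_zeros = (day & -day).bit_length() - 1
    # Days with more trailing zeros than there are sets use the last set.
    return sets[min(trailing_zeros, len(sets) - 1)]

schedule = "".join(hanoi_set(day) for day in range(1, 17))
print(schedule)
```

Over the first sixteen days this gives A on every odd day, B on days 2, 6, 10 and 14, C on days 4 and 12, D on day 8 and E on day 16, which matches the intervals in the list above. Passing a longer string of set labels adds a medium set and extends the history.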

The main advantage of this scheme is that a new medium set can be added to extend the backup history (as with GFS). The more frequently used medium sets hold the newest copies of files, whereas the less frequently used medium sets hold older file versions.

This scheme is quite difficult to administer manually. It is therefore highly recommended to use backup software that can schedule the whole process (e.g. NetVault 7.1), especially when using a tape autoloader (e.g. the Tandberg SLR140 autoloader), or to use a more suitable solution with more slots, such as a tape library (e.g. ADIC Scalar 24, ADIC Scalar 100), to hold enough medium sets for backup, archiving and disaster recovery. Like the Grandfather-Father-Son scheme, the Tower of Hanoi allows a medium set to be taken out periodically for archiving.

Conclusion

Current backup trends add primary and secondary data storage to backup schemes for higher safety. In most companies a NAS (Network Attached Storage) device serves as the primary storage: the workweek backups are written to its disks and then migrated to the tape drives of tape libraries.

When designing the whole solution, it is good to start from the equations below, which calculate the number of tapes needed for safe backup, data archiving and disaster recovery. Next time we will talk about primary storage built on NAS devices (e.g. Iomega NAS p400/p800) and the subsequent data migration to secondary storage built on backup schemes.

Calculation of number of tapes needed for backup including archiving and Disaster Recovery:

Tapes dedicated to backups
Xs = D * T * S * R + N
Xs = number of tapes needed for backup for a period of one year
D = number of backup drives
T = number of tapes in a media set
S = number of media sets in a backup scheme
R = number of backup scheme rotations per year

Tapes dedicated to archiving
Xa = T * S * A
Xa = number of tapes needed for archiving
T = number of tapes needed for backup duplication of each server
S = number of servers
A = number of archiving sets per year

Tapes dedicated to Disaster Recovery
Xr = T * S * R
Xr = number of tapes needed for recovery
T = number of tapes needed for Disaster Recovery backup of one server (see archiving)
S = number of servers (see archiving)
R = number of required disaster recovery rotations per year

Total annual consumption of data tapes
X = Xs + Xa + Xr + R
X = total number of tapes needed for a period of one year
Xs = number of tapes needed for backup for a period of one year
Xa = number of tapes needed for archiving
Xr = number of tapes needed for recovery
R = approximate number of tapes that will need to be replaced by new ones
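
The four formulas can be combined into a small Python sketch. The function and parameter names are hypothetical, and the input figures in the example are invented purely to show the arithmetic. The text does not define N, so the sketch treats it as an optional number of extra tapes that defaults to zero.

```python
def backup_tapes(drives, tapes_per_set, sets, rotations, extra=0):
    """Xs = D * T * S * R + N (N is undefined in the text; assumed extra tapes)."""
    return drives * tapes_per_set * sets * rotations + extra

def archive_tapes(tapes_per_server, servers, archive_sets):
    """Xa = T * S * A"""
    return tapes_per_server * servers * archive_sets

def recovery_tapes(tapes_per_server, servers, rotations):
    """Xr = T * S * R"""
    return tapes_per_server * servers * rotations

def total_tapes(xs, xa, xr, replacements):
    """X = Xs + Xa + Xr + R"""
    return xs + xa + xr + replacements

# Invented example: one drive, 2 tapes per medium set, 12 medium sets (GFS),
# one rotation per year, 3 servers, 4 archive sets, 2 recovery rotations.
xs = backup_tapes(drives=1, tapes_per_set=2, sets=12, rotations=1)
xa = archive_tapes(tapes_per_server=2, servers=3, archive_sets=4)
xr = recovery_tapes(tapes_per_server=2, servers=3, rotations=2)
print(xs, xa, xr, total_tapes(xs, xa, xr, replacements=5))
```

Keeping each formula in its own function mirrors the way the symbols T, S and R are redefined from one formula to the next.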

http://www.storage.cz/en/specialized-section/detail/id/46-definition-and-backup-rotation