Creating Virtual File System in .NET
In this article we will describe how to create a virtual file system in .NET with basic functionality such as on-demand folder content listing (population), on-demand file content hydration, offline files support and client-to-server synchronization.
All programming interfaces covered in this article are cross-platform and are supported on both Windows and macOS. The cross-platform server-to-client synchronization based on Sync ID is described in a separate article (see Sync ID Algorithm). For specifics of the platform code, refer to the articles dedicated to each platform.
The Engine is designed to publish data from almost any storage, such as cloud storage, document management systems, databases, etc. For demo purposes, this article uses a Serv object that reads data from your remote storage and returns it to the Engine. In your real-life application you will replace it with requests to your remote storage using your API.
Core Interfaces
There are two types of items in a virtual file system: folders, represented by the IFolder interface, and files, represented by the IFile interface. Both IFolder and IFile are derived from the IFileSystemItem interface, which provides common methods and properties.
Files and folders are created by the factory method GetFileSystemItemAsync(), which you must override in your class derived from EngineWindows or EngineMac. A typical project contains a VirtualEngine class derived from EngineWindows or EngineMac, a VirtualFile class that implements IFile, and a VirtualFolder class that implements IFolder.
Remote Storage Item ID
Each file and folder in the User File System can store an item identifier, called the remote storage item ID, that links the user file system item to your remote storage item.
You will set the remote storage item ID for the root folder prior to starting the Engine using the IEngine.SetRemoteStorageRootItemId() method. The Engine will pass the ID into each IEngine.GetFileSystemItemAsync() method call, so you can identify your item and perform calls to your remote storage.
using (engine = new VirtualEngine("C:\\Users\\User1\\UFS\\", ...))
{
    engine.SetRemoteStorageRootItemId(remoteStorageItemId);
    await engine.StartAsync(); // Start processing file system calls.
    Console.ReadKey(); // Keep Engine running.
}
You will supply the remote storage item ID to the Engine in the following cases:
- Before starting the Engine, for the root folder. You will use the SetRemoteStorageRootItemId() method to set the root folder ID.
- When listing folder content. You will set the ID of each child item that you return to the Engine from IFolder.GetChildrenAsync() method. See Implementing Folder Content Listing.
- After creating files and folders. You will return the new item ID from the IFolder.CreateFileAsync() and IFolder.CreateFolderAsync() methods. See Creating Files and Folders in the Remote Storage.
- With each item, if you implement the Sync ID algorithm (optional). Each modified item returned by your remote storage will contain the item ID and the parent item ID. See Sync ID Algorithm.
Note that the remote storage item ID must be unique within your file system and can NOT change during the lifetime of the item, including during a move operation.
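The examples in this article pass the remote storage item ID as a byte[]. As a minimal sketch, assuming your remote storage identifies items either by a GUID or by a string key, the ID could be converted to and from a byte array as follows. The RemoteStorageIdCodec class and its methods are hypothetical helpers introduced for illustration; they are not part of the User File System library.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the User File System library.
public static class RemoteStorageIdCodec
{
    // GUID-based remote storage: a GUID maps to a fixed 16-byte array.
    public static byte[] FromGuid(Guid id) => id.ToByteArray();

    public static Guid ToGuid(byte[] remoteStorageItemId) => new Guid(remoteStorageItemId);

    // String-based remote storage: encode the key as UTF-8.
    public static byte[] FromString(string id) => Encoding.UTF8.GetBytes(id);

    public static string ToStringId(byte[] remoteStorageItemId) =>
        Encoding.UTF8.GetString(remoteStorageItemId);
}
```

Whichever encoding you choose, it has to be stable: the same remote item must always map to the same bytes, because the ID cannot change during the lifetime of the item.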
Implementing Factory Method
When a platform receives a call via its file system API, such as folder listing or reading file content, the Engine will first request a folder or file item by calling the IEngine.GetFileSystemItemAsync() virtual factory method. The Engine will pass the remote storage item ID and an item type (file or folder) as parameters. After that, the Engine will call IFile or IFolder interface methods on the item returned by GetFileSystemItemAsync().
Create a VirtualEngine class in your project and derive it from EngineWindows or EngineMac class. Then override the GetFileSystemItemAsync() method, which will return your files and folders:
public class VirtualEngine : EngineWindows
{
    public override async Task<IFileSystemItem> GetFileSystemItemAsync(
        byte[] remoteStorageItemId,
        FileSystemItemType itemType,
        IContext context,
        ILogger logger = null)
    {
        if (itemType == FileSystemItemType.File)
        {
            return new VirtualFile(remoteStorageItemId);
        }
        else
        {
            return new VirtualFolder(remoteStorageItemId);
        }
    }
}
Note that typically, for performance reasons, you should NOT make any server calls inside your GetFileSystemItemAsync() method implementation. Instead, you will just create an item and return it to the Engine. You will make server calls inside your IFile and IFolder method implementations.
Depending on the platform, you can get additional data inside the GetFileSystemItemAsync() call. For example, on Windows you can get the user file system path, which is passed in the context parameter. However, on other platforms, such as macOS and iOS, the path is not available.
string userFileSystemPath = context.FileNameHint;
Folder Listing (Population)
During the initial Engine start, the content of the root folder and all underlying folders is unknown. When a folder listing is performed, for example because you started browsing your virtual file system in the OS file manager, or because an application opens a file on your disk, the OS API lists the content of the folder. At this point the platform blocks the listing until the folder content is populated.
When the platform makes a folder listing request, the Engine calls the IFolder.GetChildrenAsync() method. In your implementation, you will call your remote storage, create a list with information about files and folders, and return it to the Engine by calling the IFolderListingResultContext.ReturnChildrenAsync() method. Below is a sample GetChildrenAsync() method implementation:
public class VirtualFolder: IFolder
{
    private readonly byte[] RemoteStorageId;

    public VirtualFolder(byte[] remoteStorageItemId)
    {
        RemoteStorageId = remoteStorageItemId;
    }

    public async Task GetChildrenAsync(
        string pattern,
        IOperationContext operationContext,
        IFolderListingResultContext resultContext,
        CancellationToken cancellationToken)
    {
        var remoteStorageChildren = await Serv.GetChildrenAsync(RemoteStorageId);

        var userFileSystemChildren = new List<IFileSystemItemMetadata>();
        foreach (var remoteStorageItem in remoteStorageChildren)
        {
            IFileSystemItemMetadata itemInfo;
            if (remoteStorageItem is ServFile)
            {
                // The item is a file.
                itemInfo = new FileMetadata();
                ((IFileMetadata)itemInfo).Length = remoteStorageItem.ContentLength;
                ((IFileMetadata)itemInfo).ContentETag = remoteStorageItem.ContentETag;
                itemInfo.Attributes = FileAttributes.Normal;
            }
            else
            {
                // The item is a folder.
                itemInfo = new FolderMetadata();
                itemInfo.Attributes = FileAttributes.Normal | FileAttributes.Directory;
            }

            itemInfo.RemoteStorageItemId = remoteStorageItem.Id;
            itemInfo.MetadataETag = remoteStorageItem.MetadataETag;
            itemInfo.Name = remoteStorageItem.DisplayName;
            itemInfo.CreationTime = remoteStorageItem.CreationDate;
            itemInfo.LastWriteTime = remoteStorageItem.LastModified;
            itemInfo.LastAccessTime = remoteStorageItem.LastModified;
            itemInfo.ChangeTime = remoteStorageItem.LastModified;

            userFileSystemChildren.Add(itemInfo);
        }

        // Return all children in one call; the second argument is the total number of items.
        await resultContext.ReturnChildrenAsync(
            userFileSystemChildren.ToArray(),
            userFileSystemChildren.Count());
    }
    ...
}
After the GetChildrenAsync() call, you will see the cloud icon next to each file in the file manager on both Windows and macOS, meaning the file does not contain any content: the file is dehydrated. Even so, such files report the correct file size, and all platform file APIs treat them as regular files. On Windows, dehydrated files and depopulated folders are also marked with the offline attribute.
Each item in the list returned to the Engine must implement the IFileMetadata or IFolderMetadata interface. The User File System library provides the FileMetadata and FolderMetadata classes, which you can use out of the box in many cases. The IFileMetadata and IFolderMetadata interfaces represent basic information about each file and folder, as well as the remote storage item ID, content eTag and metadata eTag. eTags are an important part of the synchronization process and allow updating content and metadata independently. See the Detecting Content and Metadata Changes article.
Folder Content Synchronization
On Windows, the GetChildrenAsync() method is called only once, during the initial on-demand population (unless you implement streaming mode). After the initial call, you will update the folder content using one of the approaches described in the Incoming Synchronization Modes article. On macOS, the platform invalidates data from time to time, and the GetChildrenAsync() method may be called more than once for the same folder. Even so, incoming synchronization is typically required on macOS too.
Listing Large Folders
The Engine is designed to list folders that contain a large number of items. To support the population of large folders, you can break the folder content into pages and return it in several steps by calling the ReturnChildrenAsync() method multiple times, until all items are returned. The platform makes the items returned by each call available immediately, so they become visible in the OS file manager while more children are still being loaded inside your GetChildrenAsync() implementation. The second parameter of the ReturnChildrenAsync() method specifies the total number of items in your folder, so the platform knows when the enumeration is completed.
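The paging approach described above can be sketched as follows. This is a self-contained illustration, not library code: the FolderPaging class is hypothetical, and the returnPage delegate stands in for a call to IFolderListingResultContext.ReturnChildrenAsync(children, totalCount). Reporting an empty folder with a single empty page is an assumption made for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical helper, not part of the User File System library.
public static class FolderPaging
{
    // Splits allItems into pages and passes each page, together with the
    // total item count, to returnPage. Returns the number of calls made.
    public static async Task<int> ReturnInPagesAsync<T>(
        IReadOnlyList<T> allItems, int pageSize, Func<T[], long, Task> returnPage)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));

        int calls = 0;
        int start = 0;
        do
        {
            int count = Math.Min(pageSize, allItems.Count - start);
            var page = new T[count];
            for (int i = 0; i < count; i++) page[i] = allItems[start + i];

            // Assumption: an empty folder is still reported once, with an empty page.
            await returnPage(page, allItems.Count);
            calls++;
            start += count;
        } while (start < allItems.Count);

        return calls;
    }
}
```

In a real implementation you would more likely request each page from your remote storage inside the loop instead of holding all items in memory, which requires the remote storage to report the total item count up front.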
Reading File Content (Hydration)
The process of downloading file content from the remote storage is called hydration. When files are initially synced from the remote storage to the User File System during the IFolder.GetChildrenAsync() method call, they do not have any content on disk. When an application opens a file handle to access the file, the platform detects that the file is dehydrated and blocks opening until the requested segment or the entire file (depending on your virtual file system mode and file API call parameters) is returned to the platform. At this moment the Engine calls the IFile.ReadAsync() method, passing the offset and the length of the block of file content requested by the platform. It also passes the output stream to which you will write the data. Below is an example of the ReadAsync() method implementation:
public class VirtualFile: IFile
{
    protected readonly byte[] RemoteStorageId;

    public VirtualFile(byte[] remoteStorageItemId)
    {
        RemoteStorageId = remoteStorageItemId;
    }

    public async Task<IFileMetadata> ReadAsync(
        Stream output,
        long offset,
        long length,
        ITransferDataOperationContext operationContext,
        ITransferDataResultContext resultContext,
        CancellationToken cancellationToken)
    {
        const int bufferSize = 0x500000; // 5 MB.

        using (ServStream stream = await Serv.GetDownloadStreamAsync(
            RemoteStorageId, offset, length, cancellationToken))
        {
            await stream.CopyToAsync(output, bufferSize, length, cancellationToken);

            // Return the eTags of the downloaded content to the Engine.
            return new FileMetadata()
            {
                ContentETag = stream.ContentETag,
                MetadataETag = stream.MetadataETag
            };
        }
    }
    ...
}
Writing Output Stream
Respecting the offset and length of the requested data is vital for the platform. Returning data from an incorrect offset or with an incorrect length will result in a corrupted file. Because the regular Stream.CopyToAsync() method does not accept the length of the data to be copied, the User File System library provides the CopyToAsync() extension method used in the example above to simplify writing.
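To illustrate what a length-limited copy involves, here is a minimal sketch. The CopyAtMostAsync() method below is a hypothetical helper written for this article; the CopyToAsync() extension shipped with the library may be implemented differently. The sketch assumes the source stream is already positioned at the requested offset.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not the library's CopyToAsync() extension.
public static class StreamCopyExtensions
{
    // Copies up to 'length' bytes from the current position of 'source'
    // to 'destination'. Returns the number of bytes actually copied.
    public static async Task<long> CopyAtMostAsync(
        this Stream source, Stream destination, int bufferSize, long length,
        CancellationToken cancellationToken = default)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException(nameof(bufferSize));

        var buffer = new byte[bufferSize];
        long remaining = length;
        while (remaining > 0)
        {
            // Never read past the requested length.
            int toRead = (int)Math.Min(buffer.Length, remaining);
            int read = await source.ReadAsync(buffer, 0, toRead, cancellationToken);
            if (read == 0) break; // The source ended before 'length' bytes were read.

            await destination.WriteAsync(buffer, 0, read, cancellationToken);
            remaining -= read;
        }
        return length - remaining;
    }
}
```

A caller can compare the returned value with the requested length to detect a source that ended early, rather than silently handing a short block to the platform.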
Download Progress
During the download process, the platform automatically calculates and displays the download progress.
Optionally, you can also call the IResultContext.ReportProgress() method to report the download progress to the platform. The IResultContext is provided via the resultContext parameter.
Method Result
If the method completes without exceptions, the file is marked as in-sync (shown with the in-sync icon on Windows and with no icon on macOS). Otherwise, the file is left in the dehydrated state and displays the cloud icon on both Windows and macOS.
Restarting Hydration
If the method fails after a part of the content was successfully saved on the client, and the platform restarts the hydration (for example, because the user double-clicked the file in the OS file manager), the Engine will resume the download from the byte following the last one successfully saved on the client, passing its position in the offset parameter.
Content eTag and Metadata eTag
The ReadAsync() method returns IFileMetadata that contains updated content eTag and metadata eTag. These eTags are stored on the client until the next IFile.WriteAsync() call or until IFolder.GetChangesAsync() call.
In your WriteAsync() implementation you will send eTag(s) to the remote storage as part of the content update, to make sure server content is not overwritten. During the GetChangesAsync() method call the Engine will compare stored eTags with eTags received from remote storage to see if item content and metadata should be updated.
See Detecting Content and Metadata Changes article.
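The comparison the Engine performs is internal to the library, but the idea can be illustrated with a short conceptual sketch. The ETagComparer class is hypothetical, and representing eTags as strings compared by ordinal equality is an assumption made for this example; it is not a description of the Engine's actual logic.

```csharp
using System;

// Conceptual illustration only; not the Engine's actual implementation.
public static class ETagComparer
{
    // Compares the eTags stored on the client with the eTags received from
    // the remote storage and reports which parts of the item are outdated.
    public static (bool UpdateContent, bool UpdateMetadata) Compare(
        string storedContentETag, string storedMetadataETag,
        string remoteContentETag, string remoteMetadataETag)
    {
        bool updateContent = !string.Equals(
            storedContentETag, remoteContentETag, StringComparison.Ordinal);
        bool updateMetadata = !string.Equals(
            storedMetadataETag, remoteMetadataETag, StringComparison.Ordinal);
        return (updateContent, updateMetadata);
    }
}
```

Keeping two separate eTags is what makes independent updates possible: a rename on the server changes only the metadata eTag, so the file content does not have to be downloaded again.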
Files Pinning
Hydrated files, marked with the in-sync icon, can be purged from the file system if there is not enough space on the disk. To avoid this, the user can "pin" a file by selecting the "Always keep on this device" menu command in Windows File Explorer. On Windows, such files are marked with the Pinned attribute and will remain in the file system regardless of the remaining disk space.
Writing File Content
When the file content or file metadata is modified and needs to be uploaded to the remote storage, the Engine calls the IFile.WriteAsync() method. The Engine passes the updated metadata and a file content stream as parameters.
public class VirtualFile: IFile
{
    ...
    public async Task<IFileMetadata> WriteAsync(
        IFileMetadata metadata,
        Stream content = null,
        IOperationContext operationContext = null,
        IInSyncResultContext inSyncResultContext = null,
        CancellationToken cancellationToken = default)
    {
        if (content != null)
        {
            // Update remote storage file content and metadata.
            var res = await Serv.UploadAsync(
                RemoteStorageId,
                content,
                // Send old content eTag to the server.
                metadata.ContentETag,
                metadata.Attributes,
                metadata.CreationTime.UtcDateTime,
                metadata.LastWriteTime.UtcDateTime,
                metadata.LastAccessTime.UtcDateTime,
                metadata.LastWriteTime.UtcDateTime,
                cancellationToken);

            // Return new eTags to the Engine.
            return new FileMetadata()
            {
                ContentETag = res.ContentETag,
                MetadataETag = res.MetadataETag
            };
        }

        // Content is unavailable or unchanged: this example ignores the call.
        return null;
    }
}
Content Stream Parameter
The content parameter contains the stream that you will upload to your server. It can be null in the following cases:
- If the file is blocked and the Engine can not open the file for reading.
- If the file metadata is modified but file content is NOT modified.
In these cases you may still want to send the file metadata to your remote storage, or you can ignore the call, as in the example above.
Method Result
If the method completes without exceptions the file is marked as in-sync. Otherwise, the file is left in the not in-sync state.
Content eTag and Metadata eTag
If you stored content and metadata eTags during your ReadAsync() call, you can read them from ContentETag and MetadataETag properties of metadata parameter in WriteAsync() call. You will attach eTag(s) to your request to be sent to your remote storage as part of the update request.
Your remote storage will return new eTag(s), which you will return to the Engine as the return value of the WriteAsync() method. They will be stored with the item until the next update or synchronization.
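On the remote storage side, the eTag sent with the update is typically used for an optimistic concurrency check. The following minimal sketch shows the idea with a hypothetical in-memory storage; the InMemoryRemoteStorage class, its methods, and the use of random GUID strings as eTags are all assumptions made for illustration and do not describe any real server API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory remote storage, used only to illustrate eTag checks.
public sealed class InMemoryRemoteStorage
{
    private readonly Dictionary<string, (byte[] Content, string ContentETag)> files =
        new Dictionary<string, (byte[] Content, string ContentETag)>();

    // Creates a file and returns its initial content eTag.
    public string Create(string id, byte[] content)
    {
        string eTag = Guid.NewGuid().ToString("N");
        files[id] = (content, eTag);
        return eTag;
    }

    // Updates the content only if the caller's eTag matches the stored one,
    // so that a newer server version is never overwritten by a stale client.
    public bool TryUpdate(string id, byte[] content, string expectedETag, out string newETag)
    {
        newETag = null;
        if (!files.TryGetValue(id, out var current) || current.ContentETag != expectedETag)
        {
            return false; // Unknown item or stale eTag: reject the update.
        }

        newETag = Guid.NewGuid().ToString("N");
        files[id] = (content, newETag);
        return true;
    }
}
```

In this sketch a rejected update would correspond to your WriteAsync() implementation throwing an exception, which leaves the file in the not in-sync state.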
Creating Files
When a file is being created in the User File System, the Engine calls IFolder.CreateFileAsync(). The Engine passes the new item metadata and a content stream as parameters. Below is an example of the CreateFileAsync() method implementation:
public class VirtualFolder: IFolder
{
    ...
    public async Task<IFileMetadata> CreateFileAsync(
        IFileMetadata metadata,
        Stream content = null,
        IOperationContext operationContext = null,
        IInSyncResultContext inSyncResultContext = null,
        CancellationToken cancellationToken = default)
    {
        var res = await Serv.NewFileAsync(
            RemoteStorageId, // This folder remote storage ID (parent ID).
            content,
            metadata.Attributes,
            metadata.CreationTime.UtcDateTime,
            metadata.LastWriteTime.UtcDateTime,
            metadata.LastAccessTime.UtcDateTime,
            metadata.LastWriteTime.UtcDateTime,
            cancellationToken);

        // Return the new item ID and eTags to the Engine.
        return new FileMetadata()
        {
            RemoteStorageItemId = res.RemoteStorageId,
            ContentETag = res.ContentETag,
            MetadataETag = res.MetadataETag,
            ...
        };
    }
}
Content Stream Parameter
The content parameter contains the stream that you will upload to your server. If the file is still blocked, null will be passed. In this case you may still want to send the file metadata to your remote storage and create a 0-length file, or you can throw an exception to indicate that the file creation failed.
Method Result
If the method completes without exceptions the file is marked as in-sync. Otherwise, the file is left in the not in-sync state. In this case the CreateFileAsync() method will be called again during next synchronization event.
On Windows platform, if the method completes without exceptions, the file is converted into a placeholder. Otherwise, the file remains a regular file.
ID, Content eTag and Metadata eTag
Your remote storage will return the new file's remote storage item ID and eTag(s) (version(s) in macOS terms). You will return the ID and eTag(s) to the Engine as part of the return value of this method.
Creating Folders
When a folder is being created in the User File System, the Engine calls the IFolder.CreateFolderAsync() method:
public class VirtualFolder: IFolder
{
    ...
    public async Task<IFolderMetadata> CreateFolderAsync(
        IFolderMetadata metadata,
        IOperationContext operationContext,
        IInSyncResultContext inSyncResultContext = null,
        CancellationToken cancellationToken = default)
    {
        var res = await Serv.NewFolderAsync(
            RemoteStorageId, // This folder remote storage ID (parent ID).
            metadata.Attributes,
            metadata.CreationTime.UtcDateTime,
            metadata.LastWriteTime.UtcDateTime,
            metadata.LastAccessTime.UtcDateTime,
            metadata.LastWriteTime.UtcDateTime,
            cancellationToken);

        // Return the new item ID and metadata eTag to the Engine.
        return new FolderMetadata()
        {
            RemoteStorageItemId = res.RemoteStorageId,
            MetadataETag = res.MetadataETag,
            ...
        };
    }
}
Method Result
If the method completes without exceptions the folder is marked as in-sync. Otherwise, the folder is left in the not in-sync state. In this case this method will be called again during next synchronization event.
On Windows platform, if the method completes without exceptions, the folder is converted into a placeholder. Otherwise, the folder remains a regular folder.
ID and Metadata eTag
Your remote storage will return the new folder's remote storage item ID and metadata eTag(s) (version(s) in macOS terms). You will return the ID and eTag(s) to the Engine as part of the return value of this method.