[Bug] Memory usage when uploading files to Data Lake Gen2
Query/Question: I'm using Azure.Storage.Files.DataLake to send data every minute from my C# code:
using System;
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Storage;
using Azure.Storage.Files.DataLake;

public class DataLakeStorage : IFileStorage
{
    private readonly DataLakeServiceClient _dataLakeClient;

    public DataLakeStorage(FileStorageOptions options)
    {
        var sharedKeyCredentials = new StorageSharedKeyCredential(options.AccountName, options.AccountKey);
        _dataLakeClient = new DataLakeServiceClient(
            new Uri($"https://{options.AccountName}.dfs.core.windows.net"),
            sharedKeyCredentials);
    }

    public async Task Add(string containerName, string path, string name, object data)
    {
        // Create the file system (container) if it does not already exist.
        DataLakeFileSystemClient fileSystemClient = _dataLakeClient.GetFileSystemClient(containerName.ToLower());
        await fileSystemClient.CreateIfNotExistsAsync();

        DataLakeDirectoryClient directoryClient = fileSystemClient.GetDirectoryClient(path);
        DataLakeFileClient fileClient = await directoryClient.CreateFileAsync(name);

        // Serialize the payload into an in-memory buffer, then append and flush it to the file.
        await using var stream = new MemoryStream();
        await JsonSerializer.SerializeAsync(stream, data, data.GetType());
        stream.Position = 0;
        await fileClient.AppendAsync(stream, 0);
        await fileClient.FlushAsync(stream.Length);
    }
}
DataLakeStorage is registered as a singleton in the web application.
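For reference, a minimal sketch of that registration in ASP.NET Core's ConfigureServices (only the singleton lifetime is taken from the report; the option values are placeholders):

public void ConfigureServices(IServiceCollection services)
{
    // Placeholder options; in the real app these presumably come from configuration.
    services.AddSingleton(new FileStorageOptions
    {
        AccountName = "myaccount",    // placeholder
        AccountKey = "<account-key>"  // placeholder
    });

    // One DataLakeStorage (and therefore one DataLakeServiceClient) for the app's lifetime.
    services.AddSingleton<IFileStorage, DataLakeStorage>();
}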
When I started using this code to send data every minute, my application's memory usage grew significantly (each file is only ~150 bytes).
App without sending to Data Lake, after 3 days of uptime: ~170 MB
App with sending to Data Lake, after 6 hours of uptime: ~900 MB
Where could the issue be? Could you please suggest what I'm doing wrong when sending data to Data Lake?
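One way to narrow this down (not from the original report) is to check whether the growth shows up in the managed heap at all, or only in the process working set. A minimal sketch using GC APIs available on netcoreapp3.1, logged after each upload:

// Hedged diagnostic sketch (an assumption, not from the issue): compare the
// managed-heap size with the overall process memory to see where growth lives.
var gcInfo = GC.GetGCMemoryInfo(); // available since .NET Core 3.0
long heapMb = gcInfo.HeapSizeBytes / (1024 * 1024);
long workingSetMb = Environment.WorkingSet / (1024 * 1024);
Console.WriteLine($"GC heap: {heapMb} MB, working set: {workingSetMb} MB");

If the heap stays flat while the working set climbs, the growth is in native/unmanaged memory; if the heap climbs, a profiler snapshot would show what is holding the buffers.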
UPDATE
Also tried
await fileClient.UploadAsync(stream, overwrite: true);
instead of
await fileClient.AppendAsync(stream, 0);
await fileClient.FlushAsync(stream.Length);
and added
<PropertyGroup>
  <ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>
but the result is the same: it still uses a lot of memory.
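If the per-call MemoryStream allocations turn out to be the source of the growth, pooling the serialization buffers is one thing to try. A minimal sketch of the Add method using the Microsoft.IO.RecyclableMemoryStream NuGet package (an assumption, not a fix confirmed in this thread):

// Requires the Microsoft.IO.RecyclableMemoryStream package
// and `using Microsoft.IO;` at the top of the file.
private static readonly RecyclableMemoryStreamManager StreamManager =
    new RecyclableMemoryStreamManager();

public async Task Add(string containerName, string path, string name, object data)
{
    DataLakeFileSystemClient fileSystemClient = _dataLakeClient.GetFileSystemClient(containerName.ToLower());
    await fileSystemClient.CreateIfNotExistsAsync();
    DataLakeDirectoryClient directoryClient = fileSystemClient.GetDirectoryClient(path);
    DataLakeFileClient fileClient = await directoryClient.CreateFileAsync(name);

    // Rent a pooled stream instead of allocating a fresh MemoryStream on every call.
    await using var stream = StreamManager.GetStream();
    await JsonSerializer.SerializeAsync(stream, data, data.GetType());
    stream.Position = 0;
    await fileClient.UploadAsync(stream, overwrite: true);
}

The manager is thread-safe and intended to be shared; disposing the rented stream returns its buffers to the pool rather than leaving them for the GC.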
Environment:
- Package: Azure.Storage.Files.DataLake 12.4.0
- App: netcoreapp3.1
- OS: Linux in a container
Issue Analytics
- State:
- Created 3 years ago
- Comments: 7 (6 by maintainers)
Read more >Top Related Medium Post
No results found
Top Related StackOverflow Question
No results found
Troubleshoot Live Code
Lightrun enables developers to add logs, metrics and snapshots to live code - no restarts or redeploys required.
Start FreeTop Related Reddit Thread
No results found
Top Related Hackernoon Post
No results found
Top Related Tweet
No results found
Top Related Dev.to Post
No results found
Top Related Hashnode Post
No results found
Top GitHub Comments
Hi @Marusyk. Unfortunately, I’m only serving as first triage in this case and have no insight into the status of this nor process that the owning team uses for triage. I’ll need to defer to @sumantmehtams and @xgithubtriage for assistance.
Hi @sumantmehtams and @xgithubtriage, any comments from your side?