[
  {
    "Id": "435109",
    "ThreadId": "210462",
    "Html": "<p>I am trying to zip large amounts of data automatically on a schedule. The resulting zip file would be too large to hold in memory (gigabytes). Is it possible to stream the zip file out to a stream object as it is generated, rather than storing the file contents in memory and then saving it all at once? Something like the following (I'm new to this library, so please bear with me):</p>\r\n<p>using (ZipFile zip = new ZipFile())<br>{<br>&nbsp;&nbsp;&nbsp;&nbsp;zip.SetOutputStream(xxx);</p>\r\n<p>&nbsp;&nbsp;&nbsp;&nbsp;zip.AddEntry(xxx,yyy);<br>&nbsp;&nbsp;&nbsp;&nbsp;zip.AddEntry(xxx,yyy);<br>&nbsp;&nbsp;&nbsp;&nbsp;zip.AddEntry(xxx,yyy);</p>\r\n<p>&nbsp;&nbsp;&nbsp;&nbsp;zip.Flush();</p>\r\n<p>}</p>",
    "PostedDate": "2010-04-25T00:32:13.553-07:00",
    "UserRole": null,
    "MarkedAsAnswerDate": null
  },
  {
    "Id": "435463",
    "ThreadId": "210462",
    "Html": "<p>IIRC, the library supports deferred adding of data - if you add an entry with a stream or a WriteDelegate, the library reads the data of the individual items while saving, so memory usage should be moderate.</p>",
    "PostedDate": "2010-04-26T05:25:10.14-07:00",
    "UserRole": null,
    "MarkedAsAnswerDate": null
  },
  {
    "Id": "435618",
    "ThreadId": "210462",
    "Html": "<p>Yes, you are correct, hardcodet: there are options for deferring.</p>\r\n<h2>Two interfaces: ZipFile and ZipOutputStream</h2>\r\n<p>First, there are two main interfaces for creating zip files in DotNetZip.&nbsp; One is the ZipFile class, which you've seen.&nbsp; The other is the ZipOutputStream class.&nbsp; Let's take them in reverse.</p>\r\n<h3>ZipOutputStream</h3>\r\n<p>Using <a href=\"http://cheeso.members.winisp.net/DotNetZipHelp/html/776a5035-37e3-4fb2-d76e-0a52e1421581.htm\">ZipOutputStream</a>, you treat the zip as a writable stream.&nbsp; Your code creates this stream and then, for each entry you want in the zip file, specifies the entry name and writes the data for that entry into the stream.&nbsp; The data is compressed as it is written.&nbsp; It is a full streaming model: a forward-only write stream.&nbsp; The code for that looks like this:</p>\r\n<div style=\"border:solid .1em #ccc;color:black;background-color:white;margin:.25em 0.5em 0 0.5em;padding:0.25em 1.75em 0.25em 1.25em\">\r\n<pre><span style=\"color:blue\">using</span> (<span style=\"color:blue\">var</span> output = <span style=\"color:blue\">new</span> ZipOutputStream(outputFileName))\r\n{\r\n    output.Password = <span style=\"color:#a31515\">&quot;VerySecret!&quot;</span>;\r\n    output.Encryption = EncryptionAlgorithm.WinZipAes256;\r\n\r\n    <span style=\"color:blue\">foreach</span> (<span style=\"color:blue\">string</span> inputFileName <span style=\"color:blue\">in</span> filesToZip)\r\n    {\r\n        System.Console.WriteLine(<span style=\"color:#a31515\">&quot;file: {0}&quot;</span>, inputFileName);\r\n\r\n        output.PutNextEntry(inputFileName);\r\n        <span style=\"color:blue\">using</span> (<span style=\"color:blue\">var</span> input = File.Open(inputFileName, FileMode.Open, FileAccess.Read,\r\n                                     FileShare.Read | FileShare.Write ))\r\n        {\r\n            <span style=\"color:blue\">byte</span>[] 
buffer= <span style=\"color:blue\">new</span> <span style=\"color:blue\">byte</span>[2048];\r\n            <span style=\"color:blue\">int</span> n;\r\n            <span style=\"color:blue\">while</span> ((n= input.Read(buffer,0,buffer.Length)) &gt; 0)\r\n            {\r\n                output.Write(buffer,0,n);\r\n            }\r\n        }\r\n    }\r\n}</pre>\r\n</div>\r\n<p>This is a handy model, and may be what you want.&nbsp; It isn't satisfactory for all uses, though, because it requires bit 3 of the general-purpose bit flag (the data descriptor, where the CRC and sizes are written after the entry data), a part of the zip spec that some platforms and tools don't support.&nbsp; It's really not that exotic, so I don't know why, but in any case some third-party tools will choke when consuming zips produced this way.&nbsp; The other attribute of ZipOutputStream is that it is an output stream only: there's no support for updating a zip file, for random access, or even for simply reading a zip file.</p>\r\n<h3>ZipFile</h3>\r\n<p>The alternative when producing a zip file is to use the <a href=\"http://cheeso.members.winisp.net/DotNetZipHelp/html/547e4c24-4683-96df-036e-19bc34ba27e4.htm\">ZipFile</a> class.&nbsp; The main verbs you use with ZipFile are AddFile and AddEntry.&nbsp; Contrary to your suggestion, in most cases using these verbs does not cause the entire contents of the entries to be stored in memory at any one time.&nbsp; When you call ZipFile.AddFile, DotNetZip stores in memory only the metadata about the entry: the name, whether it will use encryption, where the data will come from when the ZipFile is eventually saved, and so on.&nbsp; Regardless of the size of the file you are adding, what's stored in memory is generally less than 256 bytes. At the time of ZipFile.Save, the source file is read and its data compressed and written to the zip, in a streaming manner. 
So, the entire contents of the entry are never held in memory.</p>\r\n<p>The only exception is when you call AddEntry() with a byte array or a string that defines the content of the entry.&nbsp; The overloads accepting these types are intended to support inserting entries with dynamically sourced content, for example a readme.txt file that contains a few lines of text.&nbsp; You can do this with the AddEntry overload that accepts a string.&nbsp; Of course, in this case the entire string is in memory at one time.</p>\r\n<div style=\"border:solid .1em #ccc;color:black;background-color:white;margin:.25em 0.5em 0 0.5em;padding:0.25em 1.75em 0.25em 1.25em\">\r\n<pre>Stream stream = ObtainStreamFromSomewhere(); \r\n<span style=\"color:blue\">using</span> (ZipFile zip = <span style=\"color:blue\">new</span> ZipFile())\r\n{\r\n  <span style=\"color:green\">// The content for this entry will be read from a filesystem file,</span>\r\n  <span style=\"color:green\">// at the time of the call to Save(). </span>\r\n  <span style=\"color:green\">// The entire data from the file is never held in memory.</span>\r\n  zip.AddFile(<span style=\"color:#a31515\">&quot;C:\\\\whatever\\\\Name-of-Entry1.txt&quot;</span>, <span style=\"color:#a31515\">&quot;files&quot;</span>); \r\n\r\n  <span style=\"color:green\">// The contents for this entry will be read from the provided stream,</span>\r\n  <span style=\"color:green\">// at the time of the call to Save().</span>\r\n  <span style=\"color:green\">// The entire contents of the stream are never held in memory.  </span>\r\n  zip.AddEntry(<span style=\"color:#a31515\">&quot;files\\\\Name-of-Entry2.bin&quot;</span>, stream);\r\n\r\n  <span style=\"color:green\">// The content for this entry will be obtained from a string. </span>\r\n  <span style=\"color:green\">// Obviously, the string is held in memory.  
</span>\r\n  zip.AddEntry(<span style=\"color:#a31515\">&quot;files\\\\Readme.txt&quot;</span>,<span style=\"color:#a31515\">&quot;this is the content of the readme entry&quot;</span>);\r\n\r\n  <span style=\"color:green\">// Save() will read data from each of the above sources and</span>\r\n  <span style=\"color:green\">// write to the provided zip file, in a streaming fashion.</span>\r\n  zip.Save(<span style=\"color:#a31515\">&quot;c:\\\\archive.zip&quot;</span>);\r\n}</pre>\r\n</div>\r\n<h3>The WriteDelegate</h3>\r\n<p>In the previous reply, hardcodet suggested another option, the <a href=\"http://cheeso.members.winisp.net/DotNetZipHelp/html/e20db3f7-587b-2457-3ea5-d25e7b1bf68d.htm\">WriteDelegate</a>.&nbsp; WriteDelegate is another kind of source that you can use with the ZipFile class. It may be interesting to you, but it is really orthogonal to the question of whether the entry data is stored in memory all at once.&nbsp; As I said above, apart from the two exceptions (byte arrays and strings), the full content of an entry is never held in memory.</p>\r\n<p>So what does the WriteDelegate do?&nbsp; It switches from a pull model to a push model.&nbsp; What I mean is this:&nbsp; When calling ZipFile.AddFile or ZipFile.AddEntry, your app provides a source for entry data to DotNetZip.&nbsp; This source might be a filesystem file, a stream, a string, or a byte array.&nbsp; When your app calls ZipFile.Save(), DotNetZip then retrieves the data for each entry from the source you provided.&nbsp; This is what I might call &quot;pull&quot;:&nbsp; DotNetZip reads the source your app provided.</p>\r\n<p>There are some cases where the application does not have a source from which DotNetZip can &quot;pull&quot; content.&nbsp; An example is a .NET DataSet.&nbsp; There's a nice WriteXml() method on the DataSet class, which can write an XML representation of a dataset into a stream.&nbsp; But there's no way to get a readable stream for the dataset, if you see what I mean.&nbsp; The DataSet can 
write to a sink, but cannot act as a source of data.&nbsp; The WriteDelegate solves that problem.&nbsp; The way it works:&nbsp; at the time of ZipFile.Save, for each entry that has a WriteDelegate as a source, DotNetZip will invoke your application code, and allow your code to write directly into the zip stream.&nbsp; It's something like the model for ZipOutputStream, if that makes sense, but just for a single entry.&nbsp; The code looks like this:</p>\r\n<div style=\"border:solid .1em #ccc;color:black;background-color:white;margin:.25em 0.5em 0 0.5em;padding:0.25em 1.75em 0.25em 1.25em\">\r\n<pre><span style=\"color:blue\">private</span> <span style=\"color:blue\">void</span> WriteEntry (String filename, Stream output)\r\n{\r\n    DataSet ds1 = ObtainDataSet();\r\n    ds1.WriteXml(output);\r\n}\r\n\r\n<span style=\"color:blue\">private</span> <span style=\"color:blue\">void</span> Run()\r\n{\r\n    <span style=\"color:blue\">using</span> (<span style=\"color:blue\">var</span> zip = <span style=\"color:blue\">new</span> ZipFile())\r\n    {\r\n        zip.AddEntry(zipEntryName, WriteEntry);\r\n        zip.Save(zipFileName);\r\n    }\r\n}</pre>\r\n</div>\r\n<p>All of this is described in various places in the fairly complete reference that is available at <a href=\"http://dotnetzip.codeplex.com/documentation\">http://dotnetzip.codeplex.com/documentation</a>.</p>\r\n<p>I've been meaning to write some programming guide material to complement that reference.&nbsp; The work item for that is <a href=\"http://dotnetzip.codeplex.com/WorkItem/View.aspx?WorkItemId=9032\">http://dotnetzip.codeplex.com/WorkItem/View.aspx?WorkItemId=9032</a>.&nbsp; This response is the kind of information I would put in that programming guide.</p>",
    "PostedDate": "2010-04-26T11:46:20.69-07:00",
    "UserRole": null,
    "MarkedAsAnswerDate": null
  }
]