{
  "WorkItem": {
    "AffectedComponent": {
      "Name": "",
      "DisplayName": ""
    },
    "ClosedComment": "",
    "ClosedDate": null,
    "CommentCount": 0,
    "Custom": null,
    "Description": "The file size we are talking about here is larger than 10GB. We have a backup solution that utilizes DotNetZip. Recently we tried backing up/compressing several WTV (Windows Recorded TV Shows) files. The compression went fine with no exceptions or errors. Then for a quick test of reliability we tried extracting one of the files. Here is where the weird thing happens. Extracting it with DotNetZip went fine with no exceptions, however comparing the extracted file content with the original showed a great number of byte differences scattered around the file in various offsets. Tried playing the extracted file and it played 1% of the file and halted. So we tried another file and after comparing the content, there was a single byte difference though playing the file with VLC went well. Another file and again a single byte difference with no playback issues. The file sizes in question are 22GB, 15GB and 11GB respectively.\n\nWe tried extracting all three files using both WinRar and 7-Zip and all files failed to extract with both applications showing a check-sum error at the end of the extraction process.\n\nI am wondering if DotNetZip has its shortcomings when dealing with large files. Or if there is some trick or tweak one might use to make it work with large files?\n\nI just want to add that DotNetZip does not have issues with files less than 10 GB, we tried with an 8GB, 4GB, 2GB, 1GB and some other files ranging between KBs to hundreds of MBs with no issues compressing or extracting and all files were verified against the original after extraction with no issues.",
    "LastUpdatedDate": "2014-03-31T06:12:42.21-07:00",
    "PlannedForRelease": "",
    "ReleaseVisibleToPublic": false,
    "Priority": {
      "Name": "Unassigned",
      "Severity": 0,
      "Id": 0
    },
    "ProjectName": "DotNetZip",
    "ReportedDate": "2014-03-26T08:00:31.147-07:00",
    "Status": {
      "Name": "Proposed",
      "Id": 1
    },
    "ReasonClosed": {
      "Name": "Unassigned"
    },
    "Summary": "issues with large files",
    "Type": {
      "Name": "Unassigned",
      "Id": 5
    },
    "VoteCount": 1,
    "Id": 16786
  },
  "FileAttachments": [],
  "Comments": [
    {
      "Message": "There are some potential file corruption issues with the default optimization ParallelDeflateThreshold  turned on. You may want to test by zipping with ParallelDeflateThreshold = -1 to see if that impacts the results.",
      "PostedDate": "2014-03-27T06:28:30.527-07:00",
      "Id": -2147483648
    },
    {
      "Message": "I am using the default value which is already ParallelDeflateThreshold = -1",
      "PostedDate": "2014-03-27T06:46:18.933-07:00",
      "Id": -2147483648
    },
    {
      "Message": "Actually the only two options I am explicitely setting are:\r\n\r\nEnableZip64 = Zip64Option.Always\nCompressionLevel = Ionic.Zlib.CompressionLevel.Default",
      "PostedDate": "2014-03-27T06:48:38.01-07:00",
      "Id": -2147483648
    },
    {
      "Message": "Can you point me to where you see that the default value is already ParallelDeflateThreshold = -1 ? There were suggestions that its default be changed to that due to the bug but I have not seen where that has occurred. The  [v1.9.1.6 doc](http://dotnetzip.herobo.com/DNZHelp/Index.html) says \" The default value for this property is 512k. \" and the v1.9.1.8 source code contains\r\n\r\n#if !NETCF\n       ParallelDeflateThreshold = 512 * 1024;\n#endif\r\n\r\nin Ionic.Zip.ZipFile::_InitInstance().\n ",
      "PostedDate": "2014-03-28T06:59:05.217-07:00",
      "Id": -2147483648
    },
    {
      "Message": "It seems related to\n[https://dotnetzip.codeplex.com/workitem/14087](https://dotnetzip.codeplex.com/workitem/14087)\nI had corruption errors (not tested with big files of 10 Gb or more) but problem solved using solution of post\n>  bob0043 wrote May 3, 2013 at 8:27 PM \n> I concur with rhpainte's as to the location of the problem but differ slightly as to the analysis and fix.\r\n\r\n> The number of buffers used is partially dependent on the number of processors. Each set of buffers is handled by a separate thread. The variables _latestCompressed, _lastFilled, and _lastFilled keep track of, respectively, the last buffer that has been compressed, the last buffer that has been filled with input awaiting compression, and the last buffer written to the output.\r\n\r\n> The code is such that _latestCompressed <= _lastFilled is always true. The devil is in that \"<\" part of the expression. The EmitPendingBuffers functions, as written, is exiting when _lastWritten == _latestCompressed but that may not be the last buffer that needs to be written -- the last to be written should be _lastFilled. There is a race condition here: depending on input file size, buffer count, thread count, processor workload, and the phase of the moon, the _latestCompressed may or may not be equal to _lastFilled when EmitPendingBuffers checks it.\r\n\r\n> So the fix should be:\nchange\n```\n> } while (doAll && (_lastWritten != _latestCompressed));\n```\n> to\n```\n} while (doAll && (_lastWritten != _lastFilled));\n```\r\n\r\n> This is in function EmitPendingBuffers. In the copy of the source I have that is line 971 of ParallelDeflateOutputStream.cs. Rhpainte notes it as line 987 of the same file.\r\n\r\n> Using \"} while (doAll && (_lastWritten != _latestCompressed || _lastWritten != _lastFilled));\", as rhpainte suggested, has a an \"extra\" check: if (_lastWritten == _lastFilled) is true then (_lastWritten == _latestCompressed) is also true (only compressed buffers are written) and there is no need to check it.\r\n\r\n\nof the link above mentioned.\n",
      "PostedDate": "2014-03-31T01:20:37.51-07:00",
      "Id": -2147483648
    },
    {
      "Message": "@johnbuuck  Sorry for the late reply. The default value for ParallelDeflateThreshold is always -1 for me.\r\n\r\n```\nusing (var writer = new FileStream(sDestinationFilePath, FileMode.Create, FileAccess.ReadWrite, FileShare.ReadWrite, nCopyBufferSize))\n{\n        using (var  ZipStream = new ZipOutputStream(writer))\n        {\n             Console.WriteLine(\"ParallelDeflateThreshold: \" + ZipStream.ParallelDeflateThreshold); //This is always -1\n         }\n}\n```\r\n\r\nAnd for caution now I am always explicitly setting it to -1.",
      "PostedDate": "2014-03-31T05:11:44.697-07:00",
      "Id": -2147483648
    },
    {
      "Message": "@dhernandez My application is simpler. it is a single buffer write. I believe this might be applicable if the corruption happens right at the end of the file but sadly the case here is different as the extracted file seems to have corrupted bytes at random offsets, nevertheless thank for pointing it out.",
      "PostedDate": "2014-03-31T05:14:11.793-07:00",
      "Id": -2147483648
    },
    {
      "Message": "Oh and I am using the latest v1.9.1.8 stable release",
      "PostedDate": "2014-03-31T05:32:45.607-07:00",
      "Id": -2147483648
    },
    {
      "Message": "Sorry for the distraction. I see now that the difference is that I am using the supplied ZipIt.exe tool which itself internally uses ZipFile (which defaults ParallelDeflateThreshold to 512k) whereas you are using ZipOutputStream (which I see defaults ParallelDeflateThreshold to -1). Thanks for supplying the code so that I could see the difference.",
      "PostedDate": "2014-03-31T06:12:42.21-07:00",
      "Id": -2147483648
    }
  ]
}