The goal is to read and write big files via InputStream and OutputStream with mark/reset supported as efficiently as possible. BufferedInputStream, which does support mark/reset, can only handle files up to 2GiB since it is backed by a byte[]. The limit is even lower on some platforms, since the byte[] uses heap memory which might not be available. Further, performance is not ideal. +++ Add memory mapped InputStream and OutputStream implementations supporting mark/reset.
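To illustrate why a ByteBuffer-backed stream gets mark/reset almost for free, here is a minimal sketch. It is not the implementation from the commits below; the class name BufferBackedInputStreamSketch and its demo() helper are hypothetical. The demo uses a direct ByteBuffer for portability; a MappedByteBuffer returned by FileChannel.map(..) is a ByteBuffer as well and could be passed in the same way.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: an InputStream over one ByteBuffer with mark/reset. */
class BufferBackedInputStreamSketch extends InputStream {
    private final ByteBuffer buf;
    private int markPos = -1; // -1: no mark set yet

    BufferBackedInputStreamSketch(ByteBuffer buf) { this.buf = buf; }

    @Override public boolean markSupported() { return true; }

    @Override public synchronized void mark(int readLimit) {
        // readLimit is ignored: the data stays in the buffer, nothing is copied.
        markPos = buf.position();
    }

    @Override public synchronized void reset() throws IOException {
        if (markPos < 0) throw new IOException("mark not set");
        buf.position(markPos);
    }

    @Override public int available() { return buf.remaining(); }

    @Override public int read() {
        return buf.hasRemaining() ? (buf.get() & 0xff) : -1;
    }

    @Override public int read(byte[] b, int off, int len) {
        if (len == 0) return 0;
        if (!buf.hasRemaining()) return -1;
        int n = Math.min(len, buf.remaining());
        buf.get(b, off, n);
        return n;
    }

    /** Reads 'a', marks, reads 'b' and 'c', resets, then reads 'b' again. */
    static String demo() {
        ByteBuffer direct = ByteBuffer.allocateDirect(6);
        direct.put("abcdef".getBytes(StandardCharsets.US_ASCII));
        direct.flip();
        BufferBackedInputStreamSketch in = new BufferBackedInputStreamSketch(direct);
        StringBuilder sb = new StringBuilder();
        try {
            sb.append((char) in.read());
            in.mark(0);
            sb.append((char) in.read());
            sb.append((char) in.read());
            in.reset();
            sb.append((char) in.read());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Since mark only records a position, no secondary byte[] copy is needed, which is the main difference to BufferedInputStream.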
commit ae17a5895088e321bc373318cc1e144a2f822f29
Add read support for memory mapped big file I/O via specialized InputStream impl., incl. mark/reset
- ByteBufferInputStream simply implements InputStream for an arbitrary ByteBuffer, which is restricted to 2GiB. Users may only need this smaller implementation for 'smaller' file sizes or for streaming a [native] ByteBuffer.
- MappedByteBufferInputStream implements InputStream for any file size, slicing the total size into memory mapped buffers via the given FileChannel. The slices are mapped lazily, and different flush/cache methods are supported to ease virtual memory usage.
- TestByteBufferInputStream: Unit test for basic functionality and perf. stats.
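The lazy slicing described above can be sketched roughly as follows. This is an assumption-laden illustration, not the MappedByteBufferInputStream code: the class LazySliceMapSketch, its fields, and the tiny slice size of 4 bytes are hypothetical and chosen only to keep the demo small. It shows how an absolute file position maps to a slice index plus an offset, and that a slice is only mapped on first access.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: a file split into lazily mapped read-only slices. */
class LazySliceMapSketch {
    private final FileChannel fc;
    private final long totalSize;
    private final int sliceSize;
    private final ByteBuffer[] slices; // null entry: slice not mapped yet
    private int mappedCount = 0;

    LazySliceMapSketch(FileChannel fc, int sliceSize) throws IOException {
        this.fc = fc;
        this.totalSize = fc.size();
        this.sliceSize = sliceSize;
        // Round up so a trailing partial slice gets its own entry.
        this.slices = new ByteBuffer[(int) ((totalSize + sliceSize - 1) / sliceSize)];
    }

    /** Maps slice idx on first use only. */
    private ByteBuffer slice(int idx) throws IOException {
        if (slices[idx] == null) {
            long start = (long) idx * sliceSize;
            long len = Math.min(sliceSize, totalSize - start);
            slices[idx] = fc.map(FileChannel.MapMode.READ_ONLY, start, len);
            mappedCount++;
        }
        return slices[idx];
    }

    /** Returns the unsigned byte at the absolute position, or -1 past EOF. */
    int byteAt(long pos) throws IOException {
        if (pos < 0 || pos >= totalSize) return -1;
        ByteBuffer b = slice((int) (pos / sliceSize));
        return b.get((int) (pos % sliceSize)) & 0xff;
    }

    /** 10-byte file, 4-byte slices: only the two touched slices get mapped. */
    static String demo() {
        try {
            Path tmp = Files.createTempFile("slice-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
            try (FileChannel fc = FileChannel.open(tmp, StandardOpenOption.READ)) {
                LazySliceMapSketch m = new LazySliceMapSketch(fc, 4);
                return m.byteAt(9) + "," + m.byteAt(0) + ","
                        + m.mappedCount + "," + m.slices.length;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real setting the slice size would be far larger, and the flush/cache modes would decide when a previously mapped slice is released again.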
95c4a3c7b6b256de4293ed1b31380d6af5ab59d0 Fix TestByteBufferInputStream: Handle OutOfMemoryError cause in IOException (Add note to FLUSH_NONE); Reduce test load / duration.
92a6d2c1476fd562721f231f89afba9342ed8a20
Bug 1080 - Add write support for memory mapped big file I/O via specialized OutputStream impl.
Added MappedByteBufferOutputStream as a child instance of MappedByteBufferInputStream, since the latter already manages the file's mapped buffer slices.
Current design is:
- MappedByteBufferInputStream (parent)
  - MappedByteBufferOutputStream
This is due to InputStream and OutputStream not being interfaces, while most functionality is provided in one class.
We could redesign both as follows:
- MappedByteBufferIOStream (parent)
  - MappedByteBufferInputStream
  - MappedByteBufferOutputStream
This might visualize things better .. dunno whether it's worth the extra redirection.
+++
MappedByteBufferInputStream:
- Adding [file] resize support via custom FileResizeOp
- All construction happens via ctors
- Handle refCount: incremented by ctor and getOutputStream(..), decremented by close
- Check whether stream is closed already -> IOException
- Simplify / Reuse code
MappedByteBufferOutputStream:
- Adding simple write operations
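The refCount handling listed above might look roughly like this. It is a sketch under assumptions only: the nested classes Parent and Child are hypothetical stand-ins for the input and output stream, and only getOutputStream(..) is a name taken from the commit message. The shared resources are released when the last holder closes, and a closed parent rejects further use with an IOException.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch of parent/child reference counting. */
class RefCountSketch {
    static final class Parent implements Closeable {
        private int refCount = 1;          // the ctor counts as first reference
        private boolean selfClosed = false;
        private boolean resourcesReleased = false;

        synchronized Child getOutputStream() throws IOException {
            checkOpen();
            refCount++;
            return new Child(this);
        }

        synchronized void checkOpen() throws IOException {
            if (selfClosed) throw new IOException("stream closed");
        }

        @Override public synchronized void close() {
            if (!selfClosed) {             // guard against double close
                selfClosed = true;
                release();
            }
        }

        synchronized void release() {
            if (--refCount == 0) {
                // A real stream would unmap its slices and close the channel here.
                resourcesReleased = true;
            }
        }

        synchronized boolean resourcesReleased() { return resourcesReleased; }
    }

    static final class Child implements Closeable {
        private final Parent parent;
        private boolean closed = false;

        Child(Parent parent) { this.parent = parent; }

        @Override public void close() {
            if (!closed) {
                closed = true;
                parent.release();
            }
        }
    }

    static String demo() {
        try {
            Parent in = new Parent();
            Child out = in.getOutputStream();             // refCount: 2
            in.close();                                   // refCount: 1
            boolean afterFirst = in.resourcesReleased();  // still held by child
            out.close();                                  // refCount: 0
            boolean afterSecond = in.resourcesReleased();
            boolean rejected = false;
            try {
                in.getOutputStream();
            } catch (IOException e) {
                rejected = true;                          // parent already closed
            }
            return afterFirst + "," + afterSecond + "," + rejected;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```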
Basic functionality is now added, incl. unit tests, which passed on Windows and GNU/Linux 32- and 64bit using JRE7 and JRE8 (Oracle/OpenJDK). Further refinements may happen via a followup bug report.
To render the MappedByteBuffer*Stream more useful, we might add JNI native mmap and munmap? This would enhance 'flushing' of a mapped buffer slice and hopping to the next. Right now, we use an array of slices, but native mmap/munmap could remove such use, map the current 'window' directly and also ensure the unmap and hence the release. Currently the unmap is only implemented in a fuzzy way, i.e. via GC or the private 'cleaner' method. +++ A r/w method using ByteBuffers might be useful as well. +++
commit 00a9ee70054872712017b5a14b19aa92068c8420 Refine MappedByteBuffer*Stream impl. and API [doc], adding stream to stream copy as well as direct memory mapped ByteBuffer access
a7a3d5ab98ee0ad33fdef50bf081afeb8295ebe4
- Validate active and GC'ed mapped-buffer count in cleanAllSlices() via close() ..
- Fix missing unmapping of the last buffer in notifyLengthChangeImpl(), branch criteria was off by one.
- cleanSlice(..) now also issues cleanBuffer(..) on the GC'ed entry, hence if the WeakReference is still alive, enforce its release.
- cleanBuffer(..) reverts FLUSH_PRE_HARD -> FLUSH_PRE_SOFT in case of an error.
- flush() -> flush(boolean metaData) to expose FileChannel.force(metaData).
- Add synchronous mode, flushing/syncing the mapped buffers when in READ_WRITE mapping mode and issuing FileChannel.force() if not READ_ONLY.
  Above is implemented via flush()/flushImpl(..) for buffers and FileChannel, as well as in syncSlice(..) for buffers only.
  flush*()/syncSlice() is covered by:
  - setLength()
  - notifyLengthChange*(..)
  - nextSlice()
  Always issue flushImpl() in close().
- Windows: Clean all buffers in setLength(), otherwise Windows will report:
- Windows: Catch MappedByteBuffer.force() IOException
- Optimization of position(..)
  position(..) is now standalone to allow issuing flushSlice(..) before gathering the new mapped buffer. This shall avoid one extra cache miss. Hence rename positionImpl(..) -> position2(..).
- All MappedByteBufferOutputStream.write(..) methods issue syncSlice(..) on the last written current slice to ensure the new 'synchronous' mode is honored.
+++
Unit tests:
- Ensure test files are being deleted
- TestByteBufferCopyStream: Reduced test file size to more sensible values.
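As a rough illustration of what flush(boolean metaData) in synchronous mode boils down to, the following sketch forces a READ_WRITE mapped buffer and then the FileChannel. It only uses the standard MappedByteBuffer.force() and FileChannel.force(boolean) calls; the class SyncFlushSketch and its helpers are hypothetical and do not mirror the actual flushImpl(..)/syncSlice(..) code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: sync a mapped slice, then the channel. */
class SyncFlushSketch {
    /** Writes back the slice's dirty pages, then forces the file itself. */
    static void flush(MappedByteBuffer slice, FileChannel fc, boolean metaData)
            throws IOException {
        slice.force();       // buffers only, comparable to a per-slice sync
        fc.force(metaData);  // file content, plus metadata if requested
    }

    static String demo() {
        try {
            Path tmp = Files.createTempFile("sync-sketch", ".bin");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, new byte[4]); // pre-size the file to the mapped length
            try (FileChannel fc = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer slice = fc.map(FileChannel.MapMode.READ_WRITE, 0, 4);
                slice.put("sync".getBytes(StandardCharsets.US_ASCII));
                flush(slice, fc, true);
            }
            // Read back through regular file I/O to observe the written bytes.
            return new String(Files.readAllBytes(tmp), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Forcing after every write trades throughput for durability, which is presumably why the synchronous mode is optional.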
bd240ebfe09b7c7a21689dee8be0cc673eb7f340
MappedByteBufferInputStream: Default CacheMode is FLUSH_PRE_HARD now (was FLUSH_PRE_SOFT)
FLUSH_PRE_SOFT cannot be handled by some platforms, e.g. Windows 32bit. FLUSH_PRE_HARD is the most reliable caching mode and it will fall back to FLUSH_PRE_SOFT if no method for 'cleaner' exists. Further, FLUSH_PRE_HARD turns out to be the fastest mode as well.