It is desired to read and write big files via InputStream and OutputStream
while supporting mark/reset in the most efficient way.
BufferedInputStream, which does support mark/reset,
can only handle files up to 2 GiB, since its byte[] buffer is int-indexed.
This is even more restricted on some platforms,
since it uses heap memory, which might not be available.
Further, its performance is not ideal.
Add memory mapped InputStream and OutputStream implementations
Add read support for memory mapped big file I/O via specialized InputStream impl., incl. mark/reset
- ByteBufferInputStream simply implements InputStream for an arbitrary ByteBuffer, which is restricted to 2 GiB
- Users may only need a smaller implementation for 'smaller' file sizes
or for streaming a [native] ByteBuffer.
- MappedByteBufferInputStream impl. InputStream for any file size,
while slicing the total size to memory mapped buffers via the given FileChannel.
The latter are mapped lazily and diff. flush/cache methods are supported
to ease virtual memory usage.
- TestByteBufferInputStream: Basic unit test for basic functionality and perf. stats.
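The ByteBuffer based variant described above can be sketched as follows. This is a minimal illustration only, not the class added by this change: the name SketchByteBufferInputStream is hypothetical, and the choice to let reset() rewind to the start position when no mark was set (as ByteArrayInputStream does) is an assumption.

```java
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical sketch: an InputStream with mark/reset over one ByteBuffer. */
final class SketchByteBufferInputStream extends InputStream {
    private final ByteBuffer buf;
    private int markPos; // position restored by reset()

    SketchByteBufferInputStream(final ByteBuffer source) {
        // duplicate() shares the content but keeps an independent position and limit
        this.buf = source.duplicate();
        this.markPos = buf.position();
    }

    @Override public boolean markSupported() { return true; }

    @Override public synchronized void mark(final int readLimit) {
        // readLimit is ignored, since the whole buffer stays addressable
        markPos = buf.position();
    }

    @Override public synchronized void reset() {
        // Assumption: without a prior mark() this rewinds to the start position
        buf.position(markPos);
    }

    @Override public int available() { return buf.remaining(); }

    @Override public int read() {
        return buf.hasRemaining() ? (buf.get() & 0xff) : -1;
    }

    @Override public int read(final byte[] b, final int off, final int len) {
        if (off < 0 || len < 0 || len > b.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) { return 0; }
        if (!buf.hasRemaining()) { return -1; }
        final int n = Math.min(len, buf.remaining());
        buf.get(b, off, n);
        return n;
    }

    @Override public long skip(final long n) {
        if (n <= 0) { return 0; }
        final int s = (int) Math.min(n, buf.remaining());
        buf.position(buf.position() + s);
        return s;
    }
}
```

Since a ByteBuffer is indexed by int, such a stream cannot address more than 2 GiB, which is why the mapped variant slices the file.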
Fix TestByteBufferInputStream: Handle OutOfMemoryError cause in IOException (Add note to FLUSH_NONE); Reduce test load / duration.
Bug 1080 - Add write support for memory mapped big file I/O via specialized OutputStream impl.
Added MappedByteBufferOutputStream as a child instance of MappedByteBufferInputStream,
since the latter already manages the file's mapped buffer slices.
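How a file larger than one buffer can be covered by lazily mapped slices may be sketched as below. The class SketchSlicedFileMapping, its fixed slice size and its method names are assumptions made for illustration; the actual stream classes also handle cache modes, resizing and write access.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical sketch: slice arithmetic and lazy mapping for one big file. */
final class SketchSlicedFileMapping {
    private final FileChannel channel;
    private final long totalSize;
    private final int sliceSize;             // assumed fixed; the last slice may be shorter
    private final MappedByteBuffer[] slices; // entries stay null until first access

    SketchSlicedFileMapping(final FileChannel channel, final int sliceSize) throws IOException {
        if (sliceSize <= 0) {
            throw new IllegalArgumentException("sliceSize must be > 0");
        }
        this.channel = channel;
        this.totalSize = channel.size();
        this.sliceSize = sliceSize;
        this.slices = new MappedByteBuffer[sliceCount(totalSize, sliceSize)];
    }

    /** Number of slices needed to cover totalSize bytes, rounding up. */
    static int sliceCount(final long totalSize, final int sliceSize) {
        return (int) ((totalSize + sliceSize - 1) / sliceSize);
    }

    /** Number of slices mapped so far. */
    int mappedCount() {
        int n = 0;
        for (final MappedByteBuffer b : slices) {
            if (b != null) { n++; }
        }
        return n;
    }

    /** Maps the slice on first use only. */
    private MappedByteBuffer slice(final int index) throws IOException {
        MappedByteBuffer b = slices[index];
        if (b == null) {
            final long offset = (long) index * sliceSize;
            final long length = Math.min(sliceSize, totalSize - offset);
            b = channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
            slices[index] = b;
        }
        return b;
    }

    /** Returns the unsigned byte at the absolute file position, or -1 past the end. */
    int read(final long position) throws IOException {
        if (position < 0 || position >= totalSize) { return -1; }
        final int index = (int) (position / sliceSize);
        final int offset = (int) (position % sliceSize);
        return slice(index).get(offset) & 0xff;
    }
}
```

The long file position is split into a slice index and an int offset within that slice, so only the slices actually touched consume virtual memory.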
Current design is:
- MappedByteBufferInputStream (parent)
this is due to InputStream and OutputStream not being interfaces,
but most functionality is provided in one class.
We could redesign both as follows:
- MappedByteBufferIOStream (parent)
This might visualize things better .. dunno whether it's worth the
- Adding [file] resize support via custom FileResizeOp
- All construction happens via ctors
- Handle refCount, incr. by ctor and getOutputStream(..), decr by close
- Check whether stream is closed already -> IOException
- Simplify / Reuse code
- Adding simple write operations
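The reference counting and closed check listed above might look roughly like the following sketch. SketchRefCountedStream and its method names are hypothetical; acquire() stands in for what getOutputStream(..) would do to the shared parent, and the sketch assumes every holder calls close() exactly once.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical sketch: shared resource released when the last holder closes. */
final class SketchRefCountedStream implements Closeable {
    private int refCount = 1;        // the constructor takes the first reference
    private boolean released = false;

    /** Takes one more reference, e.g. when handing out a child stream. */
    synchronized SketchRefCountedStream acquire() throws IOException {
        checkOpen();
        refCount++;
        return this;
    }

    /** Every I/O method would call this first. */
    synchronized void checkOpen() throws IOException {
        if (released) {
            throw new IOException("stream closed");
        }
    }

    /** Drops one reference; the last one releases the underlying resources. */
    @Override
    public synchronized void close() {
        if (released) { return; }
        if (--refCount == 0) {
            // Here the mapped buffers would be cleaned and the channel closed.
            released = true;
        }
    }

    synchronized int refCount() { return refCount; }

    synchronized boolean isReleased() { return released; }
}
```

A real child stream would additionally keep its own closed flag, so that closing the same child twice cannot drop the reference of another holder.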
Basic functionality now added incl. unit tests
passed on Windows and GNU/Linux 32- and 64bit
using JRE7 and JRE8 (Oracle/OpenJDK).
Further refinements may happen via a followup bug report.
To render the MappedByteBuffer*Stream more useful,
we might add JNI native mmap and munmap ?
This would enhance 'flushing' of a mapped buffer slice
and hopping to the next.
Right now, we use an array of slices,
but native mmap/munmap could remove such use,
map the current 'window' directly
and also ensure the unmap and hence the release.
Currently the unmap is only impl. in a fuzzy way,
i.e. via GC or private 'cleaner' method.
A r/w method using ByteBuffers might be useful as well.
Refine MappedByteBuffer*Stream impl. and API [doc],
adding stream to stream copy
as well as direct memory mapped ByteBuffer access
- Validate active and GC'ed mapped-buffer count
in cleanAllSlices() via close() ..
- Fix missing unmapping last buffer in notifyLengthChangeImpl(),
branch criteria was off by one.
- cleanSlice(..) now also issues cleanBuffer(..) on the GC'ed entry,
hence if the WeakReference is still alive, enforce its release.
- cleanBuffer(..) reverts FLUSH_PRE_HARD -> FLUSH_PRE_SOFT
in case of an error.
- flush() -> flush(boolean metaData) to expose FileChannel.force(metaData).
- Add synchronous mode, flushing/syncing the mapped buffers when
in READ_WRITE mapping mode and issuing FileChannel.force() if not READ_ONLY.
Above is implemented via flush()/flushImpl(..) for buffers and FileChannel,
as well as in syncSlice(..) for buffers only.
flush*()/syncSlice() is covered by:
Always issue flushImpl() in close().
- Windows: Clean all buffers in setLength(),
otherwise Windows will report:
- Windows: Catch MappedByteBuffer.force() IOException
- Optimization of position(..)
position(..) is now standalone to allow issuing flushSlice(..)
before gathering the new mapped buffer.
This shall avoid one extra cache miss.
Hence rename positionImpl(..) -> position2(..).
- All MappedByteBufferOutputStream.write(..) methods
issue syncSlice(..) on the last written current slice
to ensure new 'synchronous' mode is honored.
- Ensure test files are being deleted
- TestByteBufferCopyStream: Reduced test file size to more sensible values.
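The flush sequence of the synchronous mode, i.e. syncing the mapped buffer first and then the FileChannel, can be sketched with plain NIO calls. SketchSyncWrite and writeAndFlush(..) are hypothetical names and this is not the stream implementation itself; it only shows the order of MappedByteBuffer.force() and FileChannel.force(metaData) for a READ_WRITE mapping.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical sketch: write through a READ_WRITE mapping, then flush. */
final class SketchSyncWrite {
    private SketchSyncWrite() { }

    /**
     * Writes data at the given file position via a mapped buffer.
     * Assumes the file already covers position + data.length bytes.
     */
    static void writeAndFlush(final Path file, final long position,
                              final byte[] data, final boolean metaData) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            final MappedByteBuffer buf =
                    ch.map(FileChannel.MapMode.READ_WRITE, position, data.length);
            buf.put(data);
            buf.force();        // sync the mapped region to the storage device
            ch.force(metaData); // sync the file, optionally including its metadata
        }
    }
}
```

Passing metaData = false skips the metadata update, which may save one I/O operation on some platforms.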
MappedByteBufferInputStream: Default CacheMode is FLUSH_PRE_HARD now (was FLUSH_PRE_SOFT)
FLUSH_PRE_SOFT cannot be handled by some platforms, e.g. Windows 32bit.
FLUSH_PRE_HARD is the most reliable caching mode
and it will fall back to FLUSH_PRE_SOFT if no method for 'cleaner' exists.
Further, FLUSH_PRE_HARD turns out to be the fastest mode as well.