| Summary: | Add support for memory mapped big file I/O via specialized InputStream and OutputStream, incl. mark/reset | ||
|---|---|---|---|
| Product: | [JogAmp] Gluegen | Reporter: | Sven Gothel <sgothel> |
| Component: | core | Assignee: | Sven Gothel <sgothel> |
| Status: | RESOLVED FIXED | ||
| Severity: | enhancement | ||
| Priority: | --- | ||
| Version: | 2.3.0 | ||
| Hardware: | All | ||
| OS: | all | ||
| Type: | FEATURE | SCM Refs: | ae17a5895088e321bc373318cc1e144a2f822f29 95c4a3c7b6b256de4293ed1b31380d6af5ab59d0 92a6d2c1476fd562721f231f89afba9342ed8a20 00a9ee70054872712017b5a14b19aa92068c8420 a7a3d5ab98ee0ad33fdef50bf081afeb8295ebe4 bd240ebfe09b7c7a21689dee8be0cc673eb7f340 |
| Workaround: | --- | ||
|
Description
Sven Gothel
2014-09-25 23:42:36 CEST
commit ae17a5895088e321bc373318cc1e144a2f822f29
Add read support for memory mapped big file I/O via specialized InputStream impl., incl. mark/reset
- ByteBufferInputStream simply impl. InputStream for an arbitrary 2GiB restricted ByteBuffer
- Users may only need a smaller implementation for 'smaller' file sizes
or for streaming a [native] ByteBuffer.
- MappedByteBufferInputStream impl. InputStream for any file size,
while slicing the total size to memory mapped buffers via the given FileChannel.
The latter are mapped lazily and diff. flush/cache methods are supported
to ease virtual memory usage.
- TestByteBufferInputStream: Basic unit test for basic functionality and perf. stats.
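The ByteBuffer-backed variant described above can be illustrated with a minimal sketch. This is not the Gluegen class: the name `SimpleByteBufferInputStream` is hypothetical, and the sketch only assumes that mark/reset maps onto saving and restoring the buffer position.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

/** Hypothetical illustration only; not Gluegen's ByteBufferInputStream. */
final class SimpleByteBufferInputStream extends InputStream {
    private final ByteBuffer buf;
    private int mark = -1; // -1: no mark set yet

    SimpleByteBufferInputStream(final ByteBuffer source) {
        // duplicate() shares the content but keeps an independent position/limit,
        // so reading here does not disturb the caller's buffer state.
        this.buf = source.duplicate();
    }

    @Override public int available() { return buf.remaining(); }

    @Override public boolean markSupported() { return true; }

    @Override public synchronized void mark(final int readLimit) {
        // The whole buffer stays addressable, hence readLimit is ignored.
        mark = buf.position();
    }

    @Override public synchronized void reset() throws IOException {
        if (mark < 0) {
            throw new IOException("mark not set");
        }
        buf.position(mark);
    }

    @Override public int read() {
        return buf.hasRemaining() ? (buf.get() & 0xff) : -1;
    }

    @Override public int read(final byte[] dst, final int off, final int len) {
        if (off < 0 || len < 0 || len > dst.length - off) {
            throw new IndexOutOfBoundsException();
        }
        if (len == 0) {
            return 0;
        }
        if (!buf.hasRemaining()) {
            return -1; // end of stream
        }
        final int n = Math.min(len, buf.remaining());
        buf.get(dst, off, n);
        return n;
    }

    @Override public long skip(final long n) {
        if (n <= 0) {
            return 0;
        }
        final int s = (int) Math.min(n, buf.remaining());
        buf.position(buf.position() + s);
        return s;
    }
}
```

Since a ByteBuffer is indexed by `int`, such a stream cannot address more than 2GiB, which is why the mapped variant slices the file.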
95c4a3c7b6b256de4293ed1b31380d6af5ab59d0
Fix TestByteBufferInputStream: Handle OutOfMemoryError cause in IOException (Add note to FLUSH_NONE); Reduce test load / duration.
92a6d2c1476fd562721f231f89afba9342ed8a20
Bug 1080 - Add write support for memory mapped big file I/O via specialized OutputStream impl.
Added MappedByteBufferOutputStream as a child instance of MappedByteBufferInputStream,
since the latter already manages the file's mapped buffer slices.
Current design is:
- MappedByteBufferInputStream (parent)
  - MappedByteBufferOutputStream
this is due to InputStream and OutputStream not being interfaces,
but most functionality is provided in one class.
We could redesign both as follows:
- MappedByteBufferIOStream (parent)
  - MappedByteBufferInputStream
  - MappedByteBufferOutputStream
This might visualize things better .. dunno whether it's worth the extra redirection.
+++
MappedByteBufferInputStream:
- Adding [file] resize support via custom FileResizeOp
- All construction happens via ctors
- Handle refCount, incr. by ctor and getOutputStream(..), decr by close
- Check whether stream is closed already -> IOException
- Simplify / Reuse code
MappedByteBufferOutputStream:
- Adding simple write operations
Basic functionality now added incl. unit tests passed on Windows and GNU/Linux 32- and 64bit using JRE7 and JRE8 (Oracle/OpenJDK).
Further refinements may happen via a followup bug report.
To render the MappedByteBuffer*Stream more useful, we might add JNI native mmap and munmap ?
This would enhance 'flushing' of a mapped buffer slice and hopping to the next.
Right now, we use an array of slices, but native mmap/munmap could remove such use,
map the current 'window' directly and also ensure the unmap and hence release.
Currently the unmap is only impl. in a fuzzy way, i.e. via GC or private 'cleaner' method.
+++
Also a r/w method using ByteBuffers might seem useful as well.
+++
commit 00a9ee70054872712017b5a14b19aa92068c8420
Refine MappedByteBuffer*Stream impl. and API [doc], adding stream to stream copy as well as direct memory mapped ByteBuffer access
a7a3d5ab98ee0ad33fdef50bf081afeb8295ebe4
- Validate active and GC'ed mapped-buffer count
in cleanAllSlices() via close() ..
- Fix missing unmapping last buffer in notifyLengthChangeImpl(),
branch criteria was off by one.
- cleanSlice(..) now also issues cleanBuffer(..) on the GC'ed entry,
hence if WeakReference is still alive, enforce its release.
- cleanBuffer(..) reverts FLUSH_PRE_HARD -> FLUSH_PRE_SOFT
in case of an error.
- flush() -> flush(boolean metaData) to expose FileChannel.force(metaData).
- Add synchronous mode, flushing/syncing the mapped buffers when
in READ_WRITE mapping mode and issue FileChannel.force() if not READ_ONLY.
Above is implemented via flush()/flushImpl(..) for buffers and FileChannel,
as well as in syncSlice(..) for buffers only.
flush*()/syncSlice() is covered by:
- setLength()
- notifyLengthChange*(..)
- nextSlice()
Always issue flushImpl() in close().
- Windows: Clean all buffers in setLength(),
otherwise Windows will report:
- Windows: Catch MappedByteBuffer.force() IOException
- Optimization of position(..)
position(..) is now standalone to allow issuing flushSlice(..)
before gathering the new mapped buffer.
This shall avoid one extra cache miss.
Hence rename positionImpl(..) -> position2(..).
- All MappedByteBufferOutputStream.write(..) methods
issue syncSlice(..) on the last written current slice
to ensure new 'synchronous' mode is honored.
+++
Unit tests:
- Ensure test files are being deleted
- TestByteBufferCopyStream: Reduced test file size to more sensible values.
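The synchronous mode described in this commit rests on two standard NIO calls, MappedByteBuffer.force() for the mapped buffer and FileChannel.force(metaData) for the channel. The following is a minimal sketch of that combination only, not the Gluegen implementation; the class `SyncWriteSketch` and the method `writeSynced` are hypothetical names, and the file is grown with a plain positional write rather than a FileResizeOp.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical illustration only; not part of Gluegen. */
final class SyncWriteSketch {
    /**
     * Writes data at the start of the file through a READ_WRITE mapping,
     * then syncs both the mapped buffer and the channel.
     */
    static void writeSynced(final Path file, final byte[] data, final boolean metaData)
            throws IOException {
        try (FileChannel fc = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // Grow the file first, so the mapping never exceeds the file size.
            if (fc.size() < data.length) {
                fc.write(ByteBuffer.wrap(new byte[1]), data.length - 1);
            }
            final MappedByteBuffer slice =
                    fc.map(FileChannel.MapMode.READ_WRITE, 0, data.length);
            slice.put(data);
            slice.force();      // sync the mapped buffer content to the storage device
            fc.force(metaData); // sync the channel; true also forces file metadata
        }
    }
}
```

A READ_ONLY mapping would skip both force calls, matching the "if not READ_ONLY" condition in the commit message.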
bd240ebfe09b7c7a21689dee8be0cc673eb7f340
MappedByteBufferInputStream: Default CacheMode is FLUSH_PRE_HARD now (was FLUSH_PRE_SOFT)
FLUSH_PRE_SOFT cannot be handled by some platforms, e.g. Windows 32bit.
FLUSH_PRE_HARD is the most reliable caching mode
and it will fall back to FLUSH_PRE_SOFT if no method for 'cleaner' exists.
Further, FLUSH_PRE_HARD turns out to be the fastest mode as well. |
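The cache modes above act on lazily mapped file slices. As a rough sketch of that slicing idea, and not of Gluegen's actual code, the hypothetical `LazySliceReader` below maps one slice at a time via FileChannel.map(..) and merely drops its reference to the previous slice when moving on. The actual unmapping is then left to the GC; a hard flush via a 'cleaner' method is deliberately not shown.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

/** Hypothetical illustration only; not Gluegen's MappedByteBufferInputStream. */
final class LazySliceReader {
    private final FileChannel fc;
    private final long totalSize;
    private final int sliceSize;
    private long currentIdx = -1;     // index of the currently mapped slice
    private MappedByteBuffer current; // the only slice kept referenced
    int mapCount = 0;                 // number of map(..) calls, for inspection

    LazySliceReader(final FileChannel fc, final int sliceSize) throws IOException {
        if (sliceSize <= 0) {
            throw new IllegalArgumentException("sliceSize must be positive");
        }
        this.fc = fc;
        this.totalSize = fc.size();
        this.sliceSize = sliceSize;
    }

    /** Returns the unsigned byte at the absolute file position, or -1 if out of range. */
    int byteAt(final long pos) throws IOException {
        if (pos < 0 || pos >= totalSize) {
            return -1;
        }
        final long idx = pos / sliceSize;
        final long start = idx * sliceSize;
        if (idx != currentIdx) {
            // Map the requested slice on demand; the last slice may be shorter.
            final long len = Math.min(sliceSize, totalSize - start);
            // Overwriting 'current' drops the reference to the previous slice.
            current = fc.map(FileChannel.MapMode.READ_ONLY, start, len);
            currentIdx = idx;
            mapCount++;
        }
        return current.get((int) (pos - start)) & 0xff;
    }
}
```

Because each slice is addressed with an `int` offset while the file position is a `long`, the total file size is not bound by the 2GiB limit of a single ByteBuffer.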