description of how reads/writes work in detail#2

Open
calestyo wants to merge 9 commits intoneilbrown:masterfrom
calestyo:description-of-reads-and-writes

Conversation

@calestyo
Contributor

Adding detailed descriptions of the following to the md(4) manpage:

  • How reads/writes work in detail, especially with respect to the
    minimum/maximum number of bytes that are always fully read/written.
  • How reads/writes work when the array is degraded and/or when a rebuild takes
    place.
  • Some general concepts of how the chunk size affects reads/writes.

Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>

@calestyo
Contributor Author

Same as before: now directly as pull request.

And again, please thoroughly read what I've written and check it for any mistakes!

@calestyo
Contributor Author

Oh and one more thing:
Are those 4 KiB really hardcoded as such, or is it actually PAGE_SIZE? If the latter, I'd need to update this.
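
As a side note on why the distinction matters: the page size is a property of the running system, not a universal constant. A minimal userspace sketch for querying it follows; `page_size_bytes` is a hypothetical helper name, and this says nothing about how the md driver itself obtains the value.

```c
#include <unistd.h>

/*
 * Hypothetical helper: return the page size of the running system in
 * bytes, or -1 if it cannot be determined.
 */
static long page_size_bytes(void)
{
	return sysconf(_SC_PAGESIZE);
}
```

On many x86 systems this returns 4096, but other architectures or kernel configurations may use larger pages, so text that says "4 KiB" would be wrong there.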

@neilbrown
Owner

On Wed, 10 Jul 2013 07:13:46 -0700 Christoph Anton Mitterer
notifications@github.com wrote:

> Same as before: now directly as pull request.
>
> And again, please thoroughly read what I've written and check it for any mistakes!

Thanks. Again, it would be easier to reply if the patch was in the email.

  • There are lots of typos and spelling mistakes. Did you use a spell checker?
    For example: "mirroed", "deivces", "Fruther".

  • Please try to avoid \fB and \fP etc. Use
    .B bold
    .BR "Bold followed immediately by a colon" :
    etc.

  • "reads\ /\ writes" - Is that really necessary? Why not "reads/writes"?

  • You repeat the same parenthetical comment several times:
    obviously,
    +if any of the block layers above is not aligned with MD, even less will at most
    +be read\ /\ written

    which looks and sounds clumsy. Say it once and note that it applies to
    other levels, or something.

  • Similarly you don't need to say "For example MD, dm-crypt ..." more than
    once.

  • data will be IO distributedly read from the devices
                 ^^^^^^^^^^^^^^^^^^^^^

    Preferred language is English. I don't think the above fits.

  • RAID4/5/6 do use "PAGE_SIZE" rather than a literal "4K".
  • are done blocks of
             ^in
  • Saying "Failed devices won’t be used for reads\ /\ writes" is redundant
    even once. Saying it lots of times gets boring.
  • "The chunk size has no effect for the non-striped levels LINEAR"
    Not strictly true. The size of each device is rounded down to a multiple
    of the chunk size.
  • .br should not be used to separate paragraphs. If you want a new
    paragraph, use ".P". If you don't, then don't. That is all.
  • "• For very large sequen..."
    I'm sure there is a proper syntax for lists in 'man'. See mdadm.8 where I
    use
    .IP \(bu 4

In general I think this content is probably useful. It needs a bit of
cleaning up first though.

NeilBrown
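
The remark above that each device's size is rounded down to a multiple of the chunk size can be illustrated with a little arithmetic. This is only a sketch: `round_down_to_chunk` is a hypothetical name and is not taken from the md driver, and both values are assumed to be in the same unit (for example 512-byte sectors).

```c
#include <stdint.h>

/*
 * Hypothetical helper (not the md driver's code): round a device size
 * down to a whole number of chunks. Both arguments use the same unit,
 * e.g. 512-byte sectors.
 */
static uint64_t round_down_to_chunk(uint64_t dev_sectors, uint64_t chunk_sectors)
{
	if (chunk_sectors == 0)
		return dev_sectors;	/* no chunk size set: nothing to round */
	return dev_sectors - (dev_sectors % chunk_sectors);
}
```

With a 128-sector (64 KiB) chunk, a 1000-sector device would contribute 896 sectors under this rule, and the trailing 104 sectors would go unused.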

calestyo added 8 commits July 16, 2013 17:43
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
Signed-off-by: Christoph Anton Mitterer <mail@christoph.anton.mitterer.name>
@MrAta

MrAta commented Dec 5, 2016

@calestyo would you please describe how and where the parity is updated on the disks?

neilbrown pushed a commit that referenced this pull request Apr 4, 2025
The test 07reshape-grow fails most of the time. But it succeeds around
1 in 5 times. When it does succeed, it causes the tests to die because
mdadm has segfaulted.

The segfault was caused by mdadm attempting to reopen a file
descriptor that was already closed. The backtrace of the segfault
was:

  #0  __strncmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:101
  #1  0x000056146e31d44b in devnm2devid (devnm=0x0) at util.c:956
  #2  0x000056146e31dab4 in open_dev_flags (devnm=0x0, flags=0)
                         at util.c:1072
  #3  0x000056146e31db22 in open_dev (devnm=0x0) at util.c:1079
  #4  0x000056146e3202e8 in reopen_mddev (mdfd=4) at util.c:2244
  #5  0x000056146e329f36 in start_array (mdfd=4,
              mddev=0x7ffc55342450 "/dev/md0", content=0x7ffc55342860,
              st=0x56146fc78660, ident=0x7ffc55342f70, best=0x56146fc6f5d0,
              bestcnt=10, chosen_drive=0, devices=0x56146fc706b0, okcnt=5,
	      sparecnt=0,  rebuilding_cnt=0, journalcnt=0, c=0x7ffc55342e90,
	      clean=1,  avail=0x56146fc78720 "\001\001\001\001\001",
	      start_partial_ok=0, err_ok=0, was_forced=0)
	                  at Assemble.c:1206
  #6  0x000056146e32c36e in Assemble (st=0x56146fc78660,
               mddev=0x7ffc55342450 "/dev/md0", ident=0x7ffc55342f70,
	       devlist=0x56146fc6e2d0, c=0x7ffc55342e90)
	                 at Assemble.c:1914
  #7  0x000056146e312ac9 in main (argc=11, argv=0x7ffc55343238)
                         at mdadm.c:1510

The file descriptor was closed early in Grow_continue(). The noted commit
moved the close() call to close the fd above the fork which caused the
parent process to return with a closed fd.

This meant reshape_array() and Grow_continue() would return in the parent
with the fd forked. The fd would eventually be passed to reopen_mddev()
which returned an unhandled NULL from fd2devnm() which would then be
dereferenced in devnm2devid.

Fix this by moving the close() call below the fork. This appears to
fix the 07revert-grow test. While we're at it, switch to using
close_fd() to invalidate the file descriptor.

Fixes: 77b72fa ("mdadm/Grow: prevent md's fd from being occupied during delayed time")
Cc: Alex Wu <alexwu@synology.com>
Cc: BingJing Chang <bingjingc@synology.com>
Cc: Danny Shih <dannyshih@synology.com>
Cc: ChangSyun Peng <allenpeng@synology.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Acked-by: Mariusz Tkaczyk <mariusz.tkaczyk@linux.intel.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
neilbrown pushed a commit that referenced this pull request Apr 4, 2025
When we create 100 partitions (major is 259 not 254) in a raid device,
mdadm may coredump:

Core was generated by `/usr/sbin/mdadm --detail --export /dev/md1p7'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
74		VPCMPEQ	(%rdi), %ymm0, %ymm1
(gdb) bt
#0  __strlen_avx2_rtm () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1  0x00007fbb9a7e4139 in __strcpy_chk (dest=dest@entry=0x55d55d6a13ac "", src=0x0, destlen=destlen@entry=32) at strcpy_chk.c:28
#2  0x000055d55ba1766d in strcpy (__src=<optimized out>, __dest=0x55d55d6a13ac "") at /usr/include/bits/string_fortified.h:79
#3  super_by_fd (fd=fd@entry=3, subarrayp=subarrayp@entry=0x7fff44dfcc48) at util.c:1289
#4  0x000055d55ba273a6 in Detail (dev=0x7fff44dfef0b "/dev/md1p7", c=0x7fff44dfe440) at Detail.c:101
#5  0x000055d55ba0de61 in misc_list (c=<optimized out>, ss=<optimized out>, dump_directory=<optimized out>, ident=<optimized out>, devlist=<optimized out>) at mdadm.c:1959
#6  main (argc=<optimized out>, argv=<optimized out>) at mdadm.c:1629

The direct cause is fd2devnm returning NULL, so add a check.

Signed-off-by: Li Xiao Keng <lixiaokeng@huawei.com>
Signed-off-by: Wu Guang Hao <wuguanghao3@huawei.com>
Acked-by: Coly Li <colyli@suse.de>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
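
The kind of check this commit message describes, testing a lookup result for NULL before passing it to `strcpy()`, might look like the sketch below. `lookup_devnm` and `copy_devnm` are hypothetical stand-ins written for this illustration; they are not mdadm's `fd2devnm()` or `super_by_fd()`.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a lookup that can fail: returns a device
 * name for fd, or NULL when none can be resolved.
 */
static const char *lookup_devnm(int fd)
{
	return fd == 3 ? "md1" : NULL;
}

/*
 * Copy the device name for fd into dest (capacity destlen bytes).
 * Returns 0 on success, -1 if the lookup failed or the name does not fit.
 */
static int copy_devnm(int fd, char *dest, size_t destlen)
{
	const char *name = lookup_devnm(fd);

	if (name == NULL)		/* guard: never call strcpy(dest, NULL) */
		return -1;
	if (strlen(name) >= destlen)	/* leave room for the terminating NUL */
		return -1;
	strcpy(dest, name);
	return 0;
}
```

The caller can then report an error for an unresolvable descriptor rather than crashing inside the string routine.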
neilbrown pushed a commit that referenced this pull request Apr 4, 2025
When we execute mdadm --assemble, udev-md-raid-assembly.rules is triggered.
Then, when we stop the array, we find a coredump from mdadm --incremental. The
function call stack is as follows:

#0  enough (level=10, raid_disks=4, layout=258, clean=1,
    avail=avail@entry=0x0) at util.c:555
#1  0x0000562170c26965 in Incremental (devlist=<optimized out>,
    c=<optimized out>, st=0x5621729b6dc0) at Incremental.c:514
#2  0x0000562170bfb6ff in main (argc=<optimized out>,
    argv=<optimized out>) at mdadm.c:1762

enough() uses the array avail, which is allocated in count_active(). That
allocation may not happen, leaving avail NULL and causing the coredump. Fix
this coredump.

Signed-off-by: Guanqin Miao <miaoguanqin@huawei.com>
Signed-off-by: lixiaokeng <lixiaokeng@huawei.com>
Signed-off-by: Jes Sorensen <jes@trained-monkey.org>
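
The commit message does not say how the crash is avoided, so the following is only one defensive option, sketched under that assumption. `enough_devices` is a hypothetical and much simplified function, not mdadm's `enough()`: it counts the devices flagged in `avail` and treats a NULL array as "not enough" instead of dereferencing it.

```c
#include <stddef.h>

/*
 * Hypothetical, simplified availability check (not mdadm's enough()):
 * returns 1 if at least `needed` of the `raid_disks` slots are flagged
 * present in avail[], 0 otherwise. A NULL avail yields 0.
 */
static int enough_devices(int raid_disks, int needed, const char *avail)
{
	int present = 0;
	int i;

	if (avail == NULL)	/* allocation never happened: do not dereference */
		return 0;
	for (i = 0; i < raid_disks; i++)
		if (avail[i])
			present++;
	return present >= needed;
}
```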