yzd

A Gift from God

 
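<p>I hope the CSDN editors will feature this blog post on the front page; it is a remarkable piece of work. The full text below is reposted from Maling.</p>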
<p><br>Linus's comment on the code: “The code looks clever and nice”!</p>
<p></p>
<p>a. memcpy in Linux kernel</p>
<p>Patch: <a href="https://patchwork.kernel.org/patch/296282/">https://patchwork.kernel.org/patch/296282/</a></p>
<p>commit id: 59daa706fbec745684702741b9f5373142dd9fdc</p>
<p>This is the first implementation to completely avoid false memory dependences in the CPU pipeline, a problem that affects all x86 CPUs. Performance improves by up to 3X. The patch was merged into a released Linux kernel, where it replaced the original implementation, which had been in place for 8 years.</p>
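<p>To make the idea concrete, here is a minimal C sketch, not the kernel's assembly. It assumes the false dependence arises when the CPU compares only the low bits of a load address against earlier stores, so a later load can appear to alias an earlier store whenever the source's low bits are smaller than the destination's. The function name <code>sketch_memcpy</code> and the mask <code>LOW_BITS_MASK</code> are hypothetical and chosen only for illustration.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mask: how many low address bits to compare.
 * Assumption for illustration only (here: the 4 KiB page offset). */
#define LOW_BITS_MASK ((uintptr_t)0xFFF)

/* Copy n bytes between NON-overlapping buffers, choosing the copy
 * direction from the low address bits of src and dst. Both directions
 * produce the same result; only the order of loads and stores differs. */
void *sketch_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    uintptr_t d_low = (uintptr_t)d & LOW_BITS_MASK;
    uintptr_t s_low = (uintptr_t)s & LOW_BITS_MASK;

    if (s_low < d_low) {
        /* A forward copy would issue loads whose low bits match
         * earlier stores, so copy from the end toward the start. */
        while (n > 0) {
            n--;
            d[n] = s[n];
        }
    } else {
        /* Forward copy: later loads never match earlier stores
         * in their low bits under the assumption above. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    }
    return dst;
}
```

<p>A byte-at-a-time loop like this only shows the direction choice; a real implementation would move wider words and would have to be measured on the target CPU.</p>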
<p></p>
<p>b. memmove in Linux kernel</p>
<p>Patch: <a href="http://lkml.org/lkml/2010/9/16/502">http://lkml.org/lkml/2010/9/16/502</a></p>
<p>commit id: 3b4b682becdfa9f42321aa024d5cc84f71f06d8c</p>
<p>This avoids the long latency and some limitations of the mov string instructions, which spend a lot of time in the decode stage, and it also avoids false memory dependences in unaligned cases.</p>
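<p>As a rough illustration of a memmove built from explicit loads and stores instead of string instructions, the following C sketch picks the copy direction so that overlapping regions are handled correctly. It is not the kernel's code; the name <code>sketch_memmove</code> is hypothetical, and the byte-wise loops stand in for the wider register moves a tuned version would use.</p>

```c
#include <stddef.h>
#include <stdint.h>

/* Move n bytes from src to dst; the regions may overlap.
 * Uses plain loads and stores, no string instructions. */
void *sketch_memmove(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d < (uintptr_t)s) {
        /* dst starts below src: a forward copy never overwrites
         * source bytes that have not been read yet. */
        for (size_t i = 0; i < n; i++)
            d[i] = s[i];
    } else {
        /* dst starts at or above src: copy backward so the tail of
         * the source is read before it can be overwritten. */
        while (n > 0) {
            n--;
            d[n] = s[n];
        }
    }
    return dst;
}
```

<p>For example, moving the first five bytes of <code>"abcdefghij"</code> two positions to the right leaves the buffer holding <code>"ababcdehij"</code>.</p>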
<p> </p>
<p>H.J. and I provided the code below.</p>
<p></p>
<p>a. 64bit memcpy/memmove for Atom, Core 2 and Core i7</p>
<p><a href="http://article.gmane.org/gmane.comp.lib.glibc.alpha/15278">http://article.gmane.org/gmane.comp.lib.glibc.alpha/15278</a></p>
<p>This patch includes optimized 64bit memcpy/memmove for Atom, Core 2 and Core i7. It improves memcpy by up to 3X on Atom, up to 4X on Core 2 and up to 1X on Core i7. It also improves memmove by up to 3X on Atom, up to 4X on Core 2 and up to 2X on Core i7.</p>
<p></p>
<p>b. 64bit memcmp for Core i7</p>
<p><a href="http://sourceware.org/ml/libc-alpha/2010-04/msg00030.html">http://sourceware.org/ml/libc-alpha/2010-04/msg00030.html</a></p>
<p>This is a 64bit SSE4-optimized memcmp. It improves memcmp by up to 3X on Intel Core i7.</p>
<p>c. 64bit strcmp</p>
<p><a href="http://sources.redhat.com/ml/libc-alpha/2009-07/msg00063.html">http://sources.redhat.com/ml/libc-alpha/2009-07/msg00063.html</a></p>
<p>The code has been checked into glibc and the OpenSolaris library.</p>
<p></p>
<p>d. 64bit strcpy</p>
<p><a href="http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/strcpy.s">http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libc/amd64/gen/strcpy.s</a></p>
<p>The code has been checked into glibc and the OpenSolaris library.</p>
<p></p>
<p>e. 32bit memset/memcpy for Atom, Core 2 and Core i7</p>
<p><a href="http://sources.redhat.com/ml/libc-alpha/2010-01/msg00016.html">http://sources.redhat.com/ml/libc-alpha/2010-01/msg00016.html</a></p>
<p>Performance improves by up to 3X to 4X in each case, and the code was successfully merged into the Moblin libc.</p>
<p></p>
<p>f. 32bit memcmp/strcmp/strncmp for Atom, Core 2 and Core i7</p>
<p><a href="http://sourceware.org/ml/libc-alpha/2010-02/msg00028.html">http://sourceware.org/ml/libc-alpha/2010-02/msg00028.html</a></p>
<p>The patch provides 32bit memcmp/strcmp/strncmp optimized for SSSE3/SSE4.2. It can improve memcmp by up to 3X and strcmp by up to 7X.</p>
<p></p>
<p>This article comes from a CSDN blog; please cite the source when reposting: <a href="http://blog.csdn.net/pennyliang/archive/2011/03/30/6288471.aspx">http://blog.csdn.net/pennyliang/archive/2011/03/30/6288471.aspx</a></p>