Become a leader in the IoT community!

New DevHeads get a 320-point leaderboard boost when joining the DevHeads IoT Integration Community. In addition to learning and advising, active community leaders are rewarded with community recognition and free tech stuff. Start your Legendary Collaboration now!

Step 1 of 5

CREATE YOUR PROFILE *Required

Change Email
OR
Step 2 of 5

WHAT BRINGS YOU TO DEVHEADS? *Choose 1 or more

Collaboration & Work 🤝
Learn & Grow 📚
Contribute Experience & Expertise 🔧
Step 3 of 5

WHAT'S YOUR INTEREST OR EXPERTISE? *Choose 1 or more

Hardware & Design 💡
Embedded Software 💻
Edge Networking
Step 4 of 5

Personalize your profile

Step 5 of 5

Read & agree to our COMMUNITY RULES

  1. We want this server to be a welcoming space! Treat everyone with respect. Absolutely no harassment, witch hunting, sexism, racism, or hate speech will be tolerated.
  2. If you see something against the rules or something that makes you feel unsafe, let staff know by messaging @admin in the "support-tickets" tab in the Live DevChat menu.
  3. No age-restricted, obscene or NSFW content. This includes text, images, or links featuring nudity, sex, hard violence, or other graphically disturbing content.
  4. No spam. This includes DMing fellow members.
  5. You must be over the age of 18 years old to participate in our community.
  6. Our community uses Answer Overflow to index content on the web. By posting in this channel your messages will be indexed on the worldwide web to help others find answers.
  7. You agree to our Terms of Service (https://www.devheads.io/terms-of-service/) and Privacy Policy (https://www.devheads.io/privacy-policy)
By clicking "Finish", you have read and agreed to the our Terms of Service and Privacy Policy.

Optimizing memcpy Performance on Intel Core i7 10700K: SIMD and Compiler Flags

I am analyzing the performance of `memcpy` on an Intel Core i7 10700K CPU , using GCC 10.2 on Linux kernel 5.10. My assumption is that its speed should be close to the time it takes to transfer one long multiplied by the number of longs being copied. Could `memcpy` be optimized to exceed this expectation, possibly using `SIMD` or other CPU specific features?

Are there any compiler flags or hardware optimizations I should be aware of to get the best performance out of `memcpy`?

CONTRIBUTE TO THIS THREAD

Browse other Product Reviews tagged 

Leaderboard

RANKED BY XP

All time
  • 1.
    Avatar
    @Nayel115
    1620 XP
  • 2.
    Avatar
    @UcGee
    650 XP
  • 3.
    Avatar
    @melta101
    600 XP
  • 4.
    Avatar
    @lifegochi
    250 XP
  • 5.
    Avatar
    @Youuce
    180 XP
  • 6.
    Avatar
    @hemalchevli
    170 XP